Preface


The primary objective of the course presented here is orientation for those 
interested in applying mathematics, but the course should also be of value 
to those interested in mathematical research and teaching or in using
mathematics in some other professional context. The course should be suitable
for college seniors and graduate students, as well as for college juniors who 
have had mathematics beyond the basic calculus sequence. Maturity is 
more significant than any formal prerequisite. 

The presentation involves a number of topics that are significant for 
applied mathematics but that normally do not appear in the curriculum or 
are depicted from an entirely different point of view. These topics include 
engineering simulations, the experience patterns of the exact sciences, the 
conceptual nature of pure mathematics and its relation to applied
mathematics, the historical development of mathematics, the associated conceptual
aspects of the exact sciences, and the metaphysical implications of
mathematical scientific theories. We will associate topics in mathematics with
areas of application. 

This presentation corresponds to a certain logical structure. But there 
is an enormous wealth of intellectual development available, and this permits 
considerable flexibility for the instructor in curricula and emphasis. The 
prime objective is to encourage the student to contact and utilize this rich 
heritage. Thus, the student’s activity is critical, and it is also critical that this 
activity be precisely formulated and communicated. 

The student’s efforts outside the classroom should be mainly devoted 
to a project of his own choice, which he should develop and report to the 
class. A student should have at least three opportunities for such reports. 
See also the comment preceding the exercises of Chapter 1. It is not necessary 
for the instructor to be expert in the topics chosen. The effort of the instructor
to understand the means of the student’s presentations can yield excellent
training. 

The exercises are intended to serve a number of purposes. Reports on 
the exercises can be used to supplement the project presentations by assigning 
questions concerning distinctly different areas. They should also assist the 
student in developing his project either by direct inclusion or by indicating 
various possibilities. In many situations there is need to precisely formulate 
the problem to be solved, and this is represented in a number of exercises by 
some ambiguity, which permits a classroom procedure in which the exercises 
are considered, without previous preparation, to the point of specifying the 
objectives and proposing methods of attack. 

The summaries of sections given in the table of contents are intended to 
assist someone who has read the book to locate specific discussions. They are 
not complete summaries in the usual sense. 

The author is deeply grateful to the students who attended the course 
during its development stages and to Walter Sewell and Y. H. Clifton for 
comment on the text. He would also like to express his appreciation to 
Gwynne Moore, who drew the illustrations, and to Mrs. Ann Davis, Mrs. 
Bonnie Farrell, and Mrs. Anne Tunstall for the preparation of the manuscript. 


Francis J. Murray


Introduction


1.1. Vocational Aspects 

We are concerned with “what is applied mathematics?” and not “how 
to apply mathematics.” In universities, there are many courses that deal with 
the latter question, and whenever possible we will take advantage of this to 
refer to their content. Our initial answer is that applied mathematics is the 
vocational use of mathematics other than in teaching or mathematical 
research, and we will explore the intellectual developments that are associated 
with such vocational uses, with emphasis on the aspects not normally part 
of the “how to” courses. 

There are many vocations in which mathematical procedures form an 
inherent part, for example, physics, engineering, and actuarial practice. 
Both the applied mathematician and the teacher of mathematics should be 
interested in the intellectual basis for this type of mathematical application. 

However, a mathematics student may also be interested in applied 
mathematics as a vocation itself. Our civilization is capable of collective 
actions on many scales, from that of a small manufacturing operation to 
national enterprises such as space exploration, highway construction, or 
food distribution. It is notorious that such collective actions can have both 
desirable and deleterious effects. An ideal scientific understanding of an 
experience complex would permit a quantitative prediction of the effects of 
such actions and possible alternatives. Among the many elements needed to 
obtain such predictions are mathematical analysis and computation.
Although the scientific understanding available in practical cases hardly ever
approximates the ideal, there are vocational opportunities for persons with 
a mathematical background. A mathematics student who is interested should 
obtain an overall understanding of the nature of these applications and in
particular of the role of mathematics, a role more subtle than that usually
ascribed to it. 

The term “applied mathematics” usually implies mathematical and 
logical discussions of considerable sophistication. At present, new
developments tend to take the form of an enormous amount of elementary
mathematics in a complex structure and to reflect the availability of automatic
data processing. But the manipulations of classical mathematics are still 
appropriate, and the applied mathematician must concern himself with both 
types of procedure. To a considerable extent, classical mathematical analysis 
has become part of the professional skill of other vocations, for example, 
electrical engineering and physics. But modern developments tend to go 
beyond the classical limitations of analysis and require novel logical
structures combining manipulation and computation, and it is in these cases that
the applied mathematician can contribute. 


1.2. Intellectual Attitudes 

The divisions of a university faculty into departments corresponding to 
various disciplines may induce the student to believe that this corresponds 
to some deep resolution of knowledge and understanding into separate and 
indeed disparate compartments. Our education tends to produce the annelid, 
or segmented worm, concept of understanding. For example, a situation 
may first be analyzed from the point of view of economics. This analysis 
leads to engineering problems, which in turn produce problems in physics 
or engineering. The latter then yield problems in mathematics, and these 
refer finally to computation. Thus groups of specialists can each deal with 
problems in their own field, nicely isolated from the others. 

But this type of resolution is essentially inapplicable even when one 
makes the simplifying approximations that are usually necessary in practice. 
The anatomy is invariably far too complex to permit such a dissection; 
indeed the analogy is even unfair to the worm. But complexity itself is not the 
only element involved. The historical evolution of the academic disciplines 
and that of the intellectual formats on which current applications are based 
are quite disparate, and it is pointless to try to fit the latter to a procrustean 
bed of scholastic subdivisions. 

The technical training and specialized skills of the various academic 
disciplines may be appropriate for a specific problem. But they must be 
applied within a framework of general overall understanding and frequently 
in an intellectual format quite independent of academic predilections. The 
actual situation is much more interesting and challenging than that which one 
might, naively, expect. The historical development of quantitative
understanding extends beyond recorded history and is fascinating in itself. If this
development is ignored, technical competence lacks a vital intellectual 
dimension. 

The student may also find it difficult to appreciate that mathematics is 
something we do, not merely something we know. Theoretical understanding 
requires complementary mathematical procedures in order to provide 
guidance for our actions. Originally mathematics was purely algorithmic. 
The texts of ancient arithmetics simply described how to get the answer. 
We shall see that ancient geometry was essentially constructive and that this 
is also true of pure mathematics in the modern sense. The constructive
viewpoint of mathematics is unifying and counteracts the pedagogical division
into courses. 

On the practical level, the applied mathematician must always approach 
his problem in a quantitative manner. Since he is part of a cooperative 
effort, he must carefully document all his activities. He must concern himself 
with all relevant circumstances of his problem and its historical development. 
The capability and desire to see a situation as a whole is an essential
intellectual characteristic of the applied mathematician.


1.3. Opportunities in Applied Mathematics 

The Federal Government is, of course, the biggest source of vocational 
opportunities in applied mathematics, although there are quite a number of 
corporation laboratories associated with special service areas such as 
communication or transportation that may also represent vocational 
opportunities, especially in relation to large-scale automatic computation. 

An astronaut has been quoted as saying that when one goes through the 
prelaunch procedure above a million pounds of highly energetic and
potentially explosive fuel, it is fascinating to consider that every item in the system
was obtained from the lowest bidder. The Federal Government has only 
limited production facilities, confined to printing and certain military and 
naval items, and most of the enormous amount of supplies and equipment 
it requires is procured by contract. The larger part of the research facilities 
in the country also are not owned by the Federal Government. Therefore, the 
bulk of research and development needed for federal operations is
accomplished by contractors, but the government does require laboratories and
specialized capabilities to oversee this work. Thus the applied mathematician 
may work on a federally funded project either as part of a contractor’s 
effort or in a government laboratory. 

Research and development is a reasonably stabilized part of federal
activity, corresponding to about 6% of the total budget. Tables 1 and 2 show
the expenditures in millions of dollars in various departments. Table 3
indicates the variation over a longer period.


Table 1. Expenditure for Research and Development

                Expenditures (millions of dollars), by fiscal year

Agency              1970    1971    1972    1973    1974    1975    1976 est.

Defense            7,424   7,541   8,117   8,417   8,791   9,189   9,468
NASA               3,699   3,337   3,373   3,271   3,281   3,181   3,406
HEW                1,235   1,288   1,513   1,791   1,888   1,862   2,423
AEC                1,346   1,303   1,298   1,361
ERDA                                               1,475   1,862   2,423
NSF                  293     335     418     428     571     571     602
Agriculture          288     315     349     349     377     418     486
Transportation       246     198     274     312     328     307     338
Interior             153     175     210     235     202     265     307
EPA                   38     101     133     145     163     207     324
Commerce             118     114     165     179     177     220     239
VA                    58      61      66      75      80      97      99
HUD                   14       9      47      48      58      52      57
Justice                5      22      13      24      44      44      50
Others               180     154     127     150     185     179     230

Research           5,506   5,685   6,169   6,428   6,783   6,355   7,192
Development        9,592   9,321   9,934  10,356  10,739  12,344  13,199
Total             15,098  15,005  16,103  16,784  17,522  18,699  20,391

Source: Special Analysis, Budget of United States Government, Office of
Management and Budget. Fiscal 1977, Study P; 1976, Study P; 1975, O; 1974, P;
1973, R; 1972, R; 1971, Q.
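
As a rough cross-check on Table 1, the Research and Development rows should add up to the Total row in each fiscal year. The following Python sketch uses figures transcribed from the table, reading every entry as whole millions of dollars. The variable and function names are illustrative assumptions and do not come from the original text.

```python
# Figures transcribed from Table 1 (millions of dollars), fiscal 1970-1976.
YEARS = [1970, 1971, 1972, 1973, 1974, 1975, 1976]
RESEARCH = [5506, 5685, 6169, 6428, 6783, 6355, 7192]
DEVELOPMENT = [9592, 9321, 9934, 10356, 10739, 12344, 13199]
TOTAL = [15098, 15005, 16103, 16784, 17522, 18699, 20391]


def discrepancies(research, development, total):
    """Return (research + development - total) for each fiscal year."""
    return [r + d - t for r, d, t in zip(research, development, total)]


if __name__ == "__main__":
    # Print the year next to the difference between the summed rows and the total.
    for year, diff in zip(YEARS, discrepancies(RESEARCH, DEVELOPMENT, TOTAL)):
        print(year, diff)
```

The rows agree exactly in every year except fiscal 1971, where the sum exceeds the stated total by one unit, presumably because the published figures were rounded.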

A major part of the expenditure for research and development is in the 
Department of Defense. There is a continuous development of missile 
weapon systems, military aircraft, submarines and antisubmarine systems, 
land-combat weapon systems, and electronic devices used in warfare. This 
development tries to exploit all available technology and requires continuous 
engineering development, involving simulations with mathematical, logical, 
and computational aspects. Simulations are also used in military planning 
and training. 
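
The phrase "a major part" can be made quantitative from the Defense and Total rows of Table 1. The Python sketch below computes the Defense share of federal research and development expenditure for each fiscal year. The figures are transcribed from the table, and the names are illustrative assumptions.

```python
# Defense and Total rows of Table 1 (millions of dollars), fiscal 1970-1976.
YEARS = [1970, 1971, 1972, 1973, 1974, 1975, 1976]
DEFENSE = [7424, 7541, 8117, 8417, 8791, 9189, 9468]
TOTAL = [15098, 15005, 16103, 16784, 17522, 18699, 20391]


def shares(part, total):
    """Return part/total as a fraction for each fiscal year."""
    return [p / t for p, t in zip(part, total)]


if __name__ == "__main__":
    # Show each year's Defense share as a percentage with one decimal place.
    for year, share in zip(YEARS, shares(DEFENSE, TOTAL)):
        print(f"{year}: {100 * share:.1f}%")
```

On these figures the Defense share stays close to one half of the total throughout the period.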

The next largest area of research and development is in the National
Aeronautics and Space Administration with its emphasis on the exploration
of space and the development of means for this purpose. Air traffic control
also requires systems with sensing and computational abilities. The
Department of Health, Education and Welfare is immediately concerned with
biological research but it also needs large-scale information storage and retrieval
systems. The Atomic Energy Commission dealt with nuclear weapon systems
and nuclear power systems, and these require engineering with a large
mathematical component. The energy aspects have been taken over by ERDA,
which has a somewhat more general responsibility in regard to sources. The
government is also involved in the planning of air and ground systems of
transportation and developing more completely mathematical methods of
weather prediction. Simulations are also of considerable importance to the
Environmental Protection Agency.


Table 2. Expenditures for Research and Development

                              Expenditures (millions of dollars), by fiscal year

Agency                              1976      1977 1978 est. 1979 est.

Defense                            9,329    10,176    11,137    12,315
Energy-ERDA                        2,225     3,181     3,881     4,188
NASA                               3,521     3,763     3,824     4,090
HEW                                2,566     2,591     2,890     3,141
NSF                                  623       650       715       764
Agriculture                          460       516       604       582
EPA                                  251       372       330       345
Transportation                       303       311       342       335
Interior                             315       293       335       342
Commerce                             224       233       270       298
Nuclear Regulatory Commission         81       104       125       145
VA                                    97       105       112       113
HUD                                   54        66        54        54
TVA                                   19        25        36        52
AID                                   23        38        35        48
Others                               142       141       165       163

Total                             20,233    22,462    24,854    26,984

Source is the same as that of Table 1, Fiscal 1978, Fiscal 1979, Study P.

The ability of a modern data-processing system to deal with an enormous
amount of information constitutes a challenge to tackle new classes of
problems. One example is the quantitative description of the national economic
system in terms of the exchange of goods and services. Another example
would be an information storage system with a flexible retrieval system to
deal with what is known concerning chemical compounds that can be
manufactured. Many of these essentially large problems are of national
interest and hence of concern to the Federal Government; for example, the
Department of Defense may want to determine the economic capability of
the country to sustain a certain war effort. One would anticipate that the
Environmental Protection Agency will concern itself with the overall
economic effects of various technological restrictions. Medical research is
frequently dependent on chemical structure information, and one may wish
to consider all available chemical compounds with a given substructure. The
analysis of complex biochemicals is in general a large-scale data-processing
task, and effective understanding of the function of biochemicals is probably
a larger problem. Modern sensing devices can gather an enormous amount of
information, which must be automatically processed in order to produce
maps or crop reports. Novel applications tend to require applied
mathematicians when there is no adequate professional discipline available and
new logical analysis and constructs are required.


Table 3. Historical Summary of Expenditures for Research and Development

                      Expenditures (millions of dollars)

Fiscal year        DOD      NASA  AEC-ERDA       HEW       NSF     Total

1954             2,487        90       383        63         4     3,148
1955             2,630        74       385        70         9     3,308
1956             2,639        71       474        86        15     3,446
1957             3,371        76       657       144        31     4,462
1958             3,664        89       804       180        34     4,991
1959             4,183       145       877       253        54     5,806
1960             5,654       401       986       324        64     7,744
1961             6,618       742     1,111       374        83     9,284
1962             6,812     1,251     1,284       512       113    10,381
1963             6,844     2,539     1,336       632       153    11,999
1964             7,517     4,171     1,505       793       203    14,707
1965             6,728     5,093     1,520       738       206    14,889
1966             6,735     5,933     1,462       879       241    16,018
1967             7,680     5,426     1,467     1,075       277    16,842
1968             8,164     4,724     1,594     1,283       315    17,030
1969             7,858     4,252     1,654     1,221       342    16,208
1970             7,424     3,699     1,346     1,235       293    15,098
1971             7,541     3,337     1,303     1,288       335    15,005
1972             8,117     3,373     1,298     1,513       418    16,103
1973             8,417     3,271     1,361     1,791       428    16,784
1974             8,791     3,181     1,475     1,888       571    17,522
1975             9,189     3,181     1,862     1,862       571    18,699
1976             9,468     3,406     2,423     2,423       603    20,391

Source is the same as for Table 1 with emphasis on Fiscal 1971, Study Q.

The growth of communication and transportation requirements in 
recent decades has provided considerable incentive to private industry to 
improve technologies and develop new ones. Power requirements have also
produced problems in distribution, the quality of the environment, and fuel
availability. Electronic computation has yielded an entirely new industry in
the 1950s and 1960s associated with an extraordinary combination of
technological developments.

In these cases and in civil and mechanical engineering, the fundamental 
improvement in technology may be to a considerable extent independent of 
mathematical considerations, except where more sophisticated physics
appears, as in the use of quantum mechanics in semiconductor theory. But
automatic data processing has made possible a much more complete analysis 
of the situations involved in the introduction of new developments, and this 
analysis is very important to obtain optimal results. Engineering procedures 
were also expanded by the availability of automatic computation to solve 
classical dynamics problems and for the data processing of automatic sensors 
involving radar and infrared. Each of these developments is of vocational 
significance for mathematics. Also, experimentation and testing for
reliability have tended to become more complex and to make demands that exceed
the classical procedures of statistics.


1.4. Course Objectives 

We will develop the concept of vocational applied mathematics and use 
it to obtain insights into other vocational uses of mathematics. The exercises 
that conclude this chapter are an integral part of the course. We will
concentrate on applied mathematics as a part of engineering effort, and we will
consider the general character of such efforts and, in particular, the role of 
simulations. We will consider the technical structure of such simulations and 
see that they are part of a format of understanding that has been developed 
in conjunction with the availability of large-scale automatic computation 
since World War II. 

This understanding is essentially based on mathematical concepts and 
procedures. Standard philosophical approaches are not really adequate to 
explain this understanding. The relation of modern mathematics, as it is now 
taught in graduate school, to applications, even the most abstract scientific 
theories, can only be explained in terms of historical development. This 
historical development began with ancient arithmetic and geometry and 
was a complex evolutionary process in which mathematics took on many 
diverse forms that have left their imprint on notation and terminology and 
on the intellectual approach of many who apply mathematics. Our study 
should provide insight into the practical applications of mathematics, the 
exact sciences, and pure mathematics itself.


Exercises

Term Project: Presumably each student has a specific reason for being interested 
in applied mathematics. It is appropriate for him to set up a project dealing with a 
subject matter of his own choice that he will carry out during the term and present in the 
form of reports to the class. The first report should describe the subject matter, the 
reasons for choosing it, and the student’s objectives in the project. Subsequent reports 
could consider (i) relevant organizations, (ii) the role of mathematically based
procedures, (iii) simulation structures, and (iv) validation concerns.


In the following set of exercises, the student may concentrate on an area of interest 
to himself. Reports to the class are very desirable. The student should use the library to 
make contact with fields involved. 


1.1. What kind of mathematics is used in the following fields:

(a) Surveying
(b) Navigation
(c) Classical mechanics and elementary physics
(d) Electricity and magnetism
(e) Fluid flow and elasticity
(f) Thermodynamics
(g) Quantum mechanics
(h) Inorganic chemistry
(i) Organic chemistry
(j) Astronomy
(k) Astrophysics
(l) Relativity
(m) Astrology
(n) Keeping book on horses
(o) Electrical engineering
(p) Civil engineering
(q) Mechanical engineering
(r) Economics
(s) Linguistics
(t) Computer science
(u) Finances
(v) Agriculture

In each instance, how do you know that your answer is reasonably complete? This
question can be subdivided and apportioned to the class.

1.2. Obtain a list of companies that hire mathematicians. Also obtain a list of
government agencies that hire mathematicians. 

1.3. Obtain a list of references describing mathematical procedures used in applied 
mathematics and normally not requiring large-scale automatic computation. 

1.4. You have a roster of consultants listed by academic disciplines. What would 
be the appropriate disciplines for the following enterprises? 

(a) A national highway system
(b) An antimissile missile defense system
(c) A light-weight personal armor
(d) A sewage-treatment plant
(e) A crime information network
(f) A cancer research institute

1.5. What kind of mathematical tables do you know about? What are they used for?

1.6. Instead of bunting, a baseball player tries to hit the ball sharply into the 
ground so that it will have a high first bounce. Is this a good idea? Give quantitative 
estimates of the skills and capabilities required. 

1.7. How can a pitcher throw a curve? Estimate quantitatively the various effects 
desired and the required skills and capabilities. 

1.8. In working the exercises of this chapter, how much assistance have you 
received from the text? Why do cars have batteries? 

1.9. What mathematical algorithms are you familiar with? What is each used for?

1.10. Describe the mathematical formulation of the behavior of the following
systems in the appropriate environment: 

(a) A missile guidance system 

(b) An aircraft with VTOL capability 

(c) A warship relative to its commander 

(d) A tank, half-track, or truck 

(e) An antisubmarine-warfare naval complex operating under one
commander

(f) A medical examination center for military personnel used for induction, 
retirement examinations, or eligibility for further or special services 

(g) A multiple-target intercontinental ballistic missile 

(h) A system for testing a medium-range land-based ground-to-ground 
missile 

(i) A fly-by-Jupiter space mission (or a soft landing on Mars or an
atmospheric penetration of Venus)

(j) A space shuttle 

1.11. Professional prudence requires that tissue removed in a surgical operation
be examined by a hospital pathologist independent of the surgical team. With associated 
clinical and personal data, the resulting records would constitute a very extensive file of 
medical information. Describe a system to centralize this information and the purposes 
it could serve. 

1.12. How can you estimate the time it will take to answer a problem of this set? 

1.13. Describe the elements that characterize the quality of the environment in 
terms of numerical parameters. What kind of mathematics would be used to describe 
the effects of polluting agents? 

1.14. There are two major types of nuclear explosive devices. Describe the
mathematical formulation of the action in each case.

1.15. What kinds of mathematics are involved in the design of nuclear reactors? 
Many interesting and very difficult mathematical problems arise from the effort to 
utilize nuclear fusion as a power source. Describe these problems. 

1.16. How is the traffic situation on a highway system described? 

1.17. How is the service of a telephone exchange described mathematically? 
A long-distance telephone network? 

1.18. What is the mathematics associated with classical (lumped parameter) 
circuit theory? How is this changed to handle integrated circuits? 

1.19. A considerable number of diverse procedures are used to analyze the structure 
of the complex molecules of biochemistry. What is the associated mathematics?

1.20. Biochemical activity is usually described in terms of chains of reactions with
“inhibitors” and “activators.” How would such a situation be described in mathematical
or logical terms? How would separation by membranes be taken into account? 

1.21. How can the chemical structure of a compound be described in such a way 
that it can be stored in a computer? If you have stored a list of such compounds, how 
could you test whether a given compound is in the list? How could you retrieve all 
compounds with a given substructure? 

1.22. If the country is considered as divided into geographical areas for the purpose 
of economic analysis, what are the quantitative parameters one would associate with 
each unit? What mathematics would be appropriate to produce a dynamic model of the 
economy on this basis? 

1.23. How would one describe an economic model for international trade? What 
is the significance of export, import, balance of trade, rate of exchange? What relation
would one expect between these variables? What would be the effect of long-range 
discrepancies? 

1.24. Modern mapping is based on aerial photography. Describe the mathematics 
involved. In this process, topographical information is obtained as numerical data before 
it is transposed to the usual chart form. What is the magnitude of the data needed to 
construct, 2 cm to a kilometer, maps of the entire United States if the charts have a 
resolution of 1 mm and the height data a resolution of 10 m? How can this data be 
compressed without loss of information? What approximations may be suitable for 
further compression and how would these be used? 

1.25. What are satellites used for, and what mathematics is involved? 

1.26. What is a power network? What is its purpose and how is it described 
mathematically? 

1.27. How is the logical structure of a computer described? 

1.28. What is the band theory of electrical conduction for semiconductors, and 
what are the associated mathematical questions? 

1.29. What is “operations research”? To what areas has it been applied?

1.30. The terms “design of experiments” and “analysis of data” can be interpreted
in different ways. Discuss these for agricultural experiments, weapon-system testing, 
and nuclear particle research. 

1.31. Modern linguistic theories have introduced mathematical constructs and
transformations. Describe these. 

1.32. Investing in corporate stocks has various quantitative aspects. Mathematical 
predictions are clearly desirable. How can these be obtained? 

1.33. Betting on the outcome of horse races has various quantitative aspects. 
Mathematical predictions are clearly desirable. How can these be obtained? 

1.34. One presumably mathematical area is called “the numbers.” Describe this.
Discuss the procedures used by the financial interests that run this business to avoid 
catastrophes due to coherent betting. 

1.35. The term “reliability” is applied in many different circumstances. What are the
quantitative aspects of the term relative to the following: (a) light bulbs; (b) integrated 
circuits; (c) automobiles; (d) airlines; (e) army rifles; (f) screw-making machines; 
(g) rocket fuel; (h) systems of computer software; (i) dictionaries; (j) scientific theories.


Simulations


2.1. Organized Efforts 

The work of the applied mathematician is usually part of a relatively 
large effort representing a contractual obligation of his employer. Examples 
are the engineering development of weapons, aircraft, naval ships, land 
vehicles, weapon systems, communication systems, transportation systems, 
computing systems, service systems, and training devices. Government 
requirements in the present day also demand extensive studies in which the 
major aspect is economics, sociology, or biology, as in ecological impact 
studies. Most work of this type is government oriented. 

Such efforts are based on a mathematical formulation of scientific and 
technical understanding. Large-scale computing permits extensive
computation for decision purposes. The correctness of this computation may be the
immediate responsibility of the applied mathematician, but this cannot be
isolated from an understanding of the total effort. The mathematician must
participate in a general team effort, and this requires appropriate
communication in the form of reports and documents.

The personnel in such an effort usually have a unifying background of 
common experience. Thus, the scientific understanding utilized tends to have 
a specific technical cast based on experience and relevant engineering 
practice. This affects the choice of mathematical procedure, as does the fact
that more general procedures can lead to severe mathematical difficulties.

In addition to these large efforts, the mathematician may also be involved 
in certain specific statistical procedures, and he may also find educational 
responsibilities. Statistics may be an integral part of a large effort, but it can 
also be needed in other immediate developments, and the applied
mathematician should be familiar with the theory. New technical developments
may introduce mathematics that is novel to the people involved, and
instruction may be desirable.


2.2. Staging 

Economic considerations are always paramount in research and 
development efforts. If a project is sponsored by a government agency, the 
value of the ultimate outcome usually has to be determined. When work is 
done by a private contractor, the resources available are limited and must be 
used efficiently. Resources include money; management and technical
manpower (especially individual capability); production, laboratory, and
computing facilities; support personnel and services; and office space.

Because of the limitations and pressures associated with resources, large
efforts are generally developed in stages that serve two purposes. Each stage
either justifies the continuation of the project into the next stage or indicates 
a termination. Funding considerations are often more complicated than go 
or no go decisions. Even when a project is not abandoned, the continuation 
may be postponed or stretched out, and for this reason the stages themselves 
are structured into phases to permit resource management. 

A government project will usually have an “in-house” first stage. A 
project can arise because experience indicates a need, or a technical
development shows that certain possibilities have become available, or the particular
agency involved may be subject to external demands. One can consider the 
first phase as consisting of a number of studies that result in a definition of the 
project and a set of objectives or requirements for the outcome. A second 
phase must study feasibility, and when this is completed a decision must be 
made to commit funds for a larger effort. The continued larger effort will 
usually be executed under contract, since the agency does not have the 
required in-house resources to proceed in any other way. The student should 
appreciate that this is a carefully maintained element in an economic and 
political system. The contracts have considerable local economic impact, 
which is politically valuable to congressmen, who in turn support the overall 
program of the agency. 

The stages in the contract work are in general highly visible and are
clearly associated with go or no-go decisions and the possibility of delays or
stretching out of funding. One would normally expect each stage to be
individually contracted for in a sequence such that the next commitment can be
approximately controlled. Variations from this procedure may occur due to
political pressure or intra-agency reasons. The contracts associated with the
later stages are more desirable, and there is pressure for early commitment.
When the project involves basic innovations whose feasibility should be
carefully established by the staging process, early commitments can be quite
unfortunate.

Suppose we are dealing with an engineering development that will lead 
to the manufacture of a large number of similar units. Clearly this will apply 
to most weapon systems, aircraft, vehicles, and so forth. An engineering 
development whose objective is a small number of large or very large units 
such as naval vessels or, say, satellites, which are large in the sense of technical 
effort, would have a somewhat different staging. There are also intermediate 
developments, but it is desirable to be definite, and we will consider the 
stages for the first type. 

The first contract phase can be considered to be a planning stage, to be 
followed by a prototype stage, a production stage, a fielding stage, a service 
stage with the important element of maintenance, and a phase-out stage. 
Such a development requires a sequence of decisions and an evolvement of 
understanding that will permit progress to the next stage. This understanding
must have a scientific and technical basis and is usually embodied in
simulations.

To use the available resources efficiently, each stage must be precisely 
structured in time. Thus the planning stage would consist of a feasibility 
phase, a product-design phase, a development plan, a prototype production 
plan, a service production plan, training plans, fielding plans, and service and 
maintenance plans. The corresponding phases could and normally would 
overlap. 

In regard to feasibility, it is usually desirable to expand the previously
available in-house studies and precisely establish the requirements. Relative
to requirements, the interests of the government and those of the contractor
do not normally coincide. A simulation of the expected use of the system,
including the environment, may be significant for feasibility. Essential
differences in points of view can be either eliminated by agreeing on the
technical base of the simulation or brought into sharp focus to fix
responsibility.

In product design, it is usual to set up a simulation of the use of the 
system in various environments on a computer to test whether a tentative 
design satisfies the requirements. In such simulations it is possible to modify 
and adjust the tentative design conveniently and obtain performance and 
cost operation. 

If a system design is available, it is possible to plan the rest of the project
completely. But for modern production, a very large amount of interrelated
data is required, such as specifications, blueprints, production scheduling,
and procurement. A structure for this data may be available, relative to a
fixed production facility, but the data itself must be produced in a form that
will permit design flexibility. Thus the planning of the production stages
involves considerable data generation and processing, and these procedures
themselves may be automated and designed. If the production facility has to
be developed or restructured, one has further design problems. These
developments involve considerable technical understanding and may require
simulations. The development of other plans will similarly involve
considerable data and structuring of procedures, and simulations may be utilized.


2.3. Simulations 

A simulation is defined in the dictionary as a pretense or feigning. There 
is a situation in which the subject of the simulation is supposed to be involved. 
One contrives an imitation of this situation so that the subject has vicarious 
experience with it. One possible purpose is deception, and many forms of 
amusement, from roller coasters to opera, are based on simulations. 

There are, however, technical situations in which vicarious experience is 
valuable. These require that the representation be logically equivalent to the 
original situation. Such simulations can be used for design purposes so as to 
anticipate the capabilities and limitations of a proposed system. They can also 
be used for training purposes to develop the understanding, skills, and proper 
reactions needed to use complex devices. Simulations may be used for
planning or to develop understanding. For example, a logical representation of an
explosion may indicate phenomena that can be detected experimentally and 
either confirm or deny certain hypotheses. 

It is clear that there are various aspects common to all these simulations.
To the subject of the simulation, the original situation is reproduced
symbolically and a time history of a specific imaginary experience with the
situation is developed. To produce this, the symbols must be activated by some
process logically equivalent to the original situation, and in technical
situations this is usually a computation, proceeding automatically in a data
processing system.

Consider, for example, a flight trainer, which is a device that enables a
pilot to learn how to fly a specific type of airplane. The cockpit of the aircraft
is simulated so that the trainee judges the flight from instrument readings and
moves the airplane controls. This is, of course, the symbolic representation
of the situation. The computation in the computer corresponds to the flight
determined by the controls, and the various numerical quantities that describe
this flight are generated as functions of the time and appear on the
instruments. The mathematical procedure that is realized by the computation is
called the math model for the airplane.

Certain elements are usually present in most computer-activated
simulations. The symbolic representation may involve a considerable amount of
equipment, which receives output from the computer and produces input.
The math model must be based on a scientific and technical analysis of the
original situation. Furthermore, the math model must be used to produce a
computer program that will yield a time development corresponding to a
specific experience. Finally, there is usually a requirement for an overall
evaluation of the total experience with the simulation.


2.4. Influence Block Diagram and Math Model 

The math model must incorporate the scientific and technical
understanding of the situation that is simulated. The appropriate technical
information and experience must be assembled and documented. The initial analysis
of the situation corresponds to the “influence block diagram.” One
determines the various aspects of the situation that can be specified uniquely in a
quantitative manner. Usually each aspect is specified by a number of
numerical values, and in a specific experience each such aspect will have a time
history, which is given by expressing these values as functions of the time.

A block in the diagram corresponds to such an aspect and its related set 
of quantities. The math model gives the mathematical basis for determining 
these quantities as functions of the time. For example, certain of these
variables may change continuously and correspond to the solution of a
simultaneous system of equations in which time is the independent variable. On
the other hand, other variables may take on only a discrete set of values and 
hence will change abruptly during a simulation. Such a change is called a 
critical event, and the math model must specify the criteria for critical events. 
In most cases, a criterion of this sort is given by a change of sign of a function 
of the continuous variables. 

These functional relations correspond to relations between the blocks 
of the diagram and may be indicated by connections on the diagram. But 
these connections represent more general relations than the specific
mathematics of the math model. A preliminary analysis can perhaps establish these
relations as a first step in determining the math model. 

Various ways can be used to analyze the situation into aspects
corresponding to diagram boxes. It may be possible to consider certain components
of a device as rigid bodies. For each such component, one has the variables
that describe the motion. These variables satisfy differential equations
containing forces. The forces depend on other aspects of the situation, and this
dependence corresponds to connections in the block diagram. The actual
functional relations for these forces yield the math model. However, various
forces or sets of forces may themselves correspond to boxes in the diagram.
In simulating the flight of an airplane, the aerodynamic forces and torques
can be considered such an aspect. These forces depend on the velocity and
altitude of the aircraft and the position of the control surfaces.

But analysis may require more sophisticated geometric constructions as, 
for example, in simulations of explosions, combustion, internal ballistics, and 
chemical processes. The situation has a number of regimes of activity in time, 
and during each regime at each instant of time, one has a spatial division into 
regions and separating surfaces. The different regimes can be numbered, and 
this number can be considered a “state variable,” which changes abruptly at 
critical events. Within each regime, the substance in a given region may be 
considered an aspect that has a quantitative description in terms of its motion 
and thermodynamic characteristics. If the substance is not a rigid body, the 
motion may be quite complex. A theoretical basis for describing such motions 
is available in the form of partial differential equations. In general, these 
partial differential equations have no formal solution. Numerical methods, 
using automatic data processing, have greatly extended the range of cases 
which can be approximated by a numerical development in time. However, 
in these cases, one may need to substitute experimentally determined patterns 
of motion for theoretical solutions of the equations. 

The notion of state variable applies to many simulations. For example, 
an aircraft can be in different flight states, such as moving on the ground, 
normal flight, or in a stall or spin. A state variable can also specify whether a 
certain malfunction obtains. In general, one needs to know the values of all 
the state variables to determine the equations that describe the change in 
the continuous variables. Thus, the state variables determine regimes in time 
during which the continuous variables change in accordance with a given 
set of rules until a critical event occurs and the set of rules changes. For 
example, in takeoff, the aircraft moves along the ground with increasing 
velocity until the aerodynamic lift force exceeds the weight. In flight, the 
aerodynamic forces will differ from those on the ground and certain forces 
will disappear. Retracting landing gear corresponds to a change in a state 
variable. 
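
The takeoff example can be sketched in code. What follows is only an illustration under stated assumptions, not the trainer's math model: the lift law (lift proportional to the square of the ground speed), the constant thrust, the neglect of drag and rolling friction, and every numerical value are hypothetical. It shows a state variable (`phase`) selecting the rule in force, and a critical event detected as a change of sign of a function of the continuous variables.

```python
# Hypothetical constants for illustration only; not data for any real aircraft.
MASS = 1000.0            # kg
WEIGHT = MASS * 9.81     # N
THRUST = 3000.0          # N, assumed constant during the ground roll
LIFT_COEFF = 12.0        # N per (m/s)^2; assumed law: lift = LIFT_COEFF * speed**2


def lift_minus_weight(speed):
    """Event function: its change of sign from negative to positive marks liftoff."""
    return LIFT_COEFF * speed ** 2 - WEIGHT


def ground_roll_acceleration(speed):
    """Rule for the 'ground' regime (drag and rolling friction neglected)."""
    return THRUST / MASS


def simulate_takeoff_roll(h=0.01, t_max=60.0):
    """Advance the ground regime in steps of h until the critical event occurs."""
    phase = "ground"                      # state variable
    t, speed, distance = 0.0, 0.0, 0.0    # continuous variables
    while phase == "ground" and t < t_max:
        speed += h * ground_roll_acceleration(speed)
        distance += h * speed
        t += h
        if lift_minus_weight(speed) > 0.0:
            phase = "flight"              # critical event: the rule set changes
    return phase, t, speed, distance
```

Calling `simulate_takeoff_roll()` returns the new value of the state variable together with the time, speed, and distance at which the event was detected; a fuller model would continue with a different rule for the flight regime.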

In general, one would consider each state variable as an independent 
aspect of the simulated situation to be represented by a box. Other aspects 
can be identified as having essentially the same significance for different 
values of the state variables but whose math model may be different. For 
example, the aerodynamic lift force has a math model that is dependent on the 
value of the flight state variable. Because of the complexity of the relations of 
state variables and continuous variables, it is desirable to begin an analysis 
of the original situation with the construction of the influence block diagram. 

The math model gives the change of the continuous variables for each
possible combination of values of the state variables. If the continuous
variables are specified as solutions of a system of differential equations in which
the independent variable is the time, they are said to be given “dynamically.”
If they are given as functions of time directly, they are said to be given
“kinematically.” The manner of change may be determined by scientific principles
or it may be given by an empirically observed pattern. An example of the
latter would be the motion of a ship after a rudder setting has been changed.

It is clear that the determination of the math model on the basis of the 
influence block diagram, scientific principles, and empirically observed 
patterns should be carefully documented. When an organization has dealt 
with similar problems over a period of time, there is usually a traditional 
procedure for determining the math model but complete documentation is 
still essential. Most situations will have individual variations. 


2.5. Temporal Patterns 

The math model specifies a combination of regimes of continuous
development followed by critical events, which can be diagrammed as a
“flow chart.” The influence block diagram and the flow chart represent
complementary aspects of the math model: the former refers to the status at an
instant of time, the latter to a time pattern of behavior.

The influence block diagram and the flow chart are logical consequences 
of the math model and are determined when the math model is given. But 
usually one deals with a somewhat more complex situation, in which the 
math model is not given first. For example, in design procedures, one may 
have cases in which the given requirements are time patterns that can be 
incorporated into a flow chart. The objective is to design a system whose 
math model yields this flow chart. One may also have a procedure based on 
experience and yielding a design that can be represented by an influence block
diagram, but the specific math model may have to be determined. A variation 
of this situation is one in which a general structure for the design is assumed 
and this general structure becomes specific when values are assigned to 
certain parameters called “design parameters.” Here it is usual to set up a 
simulation in which these parameters can be adjusted so as to obtain the 
desired behavior, which may be represented by a flow chart, or to avoid 
certain objectionable characteristics. 

Thus, while the math model logically determines the influence block 
diagram and the flow chart and the computational procedures associated 
with individual experiences, this is not necessarily the order in which they 
arise in the analysis of simulations. The math model, the influence block 
diagram, and the flow chart form a logical and symbolic entity that is 
valuable for engineering and many other situations. 

A flow chart can be considered to have a network or linear graph
character. We can draw a flow chart as a linear graph in which regimes
correspond to one-cells and critical events to end points, or as a network with
regimes corresponding to branches and critical events to nodes. But the
regimes are directed, of course, and the choice of the new regime that follows
a critical event may be determined by something not represented on the flow
chart. Nevertheless, any specific time history corresponds to a path on this
flow chart. A permissible path of this type is called a “scenario.”
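
The network picture can be made concrete with a small sketch. The regimes and critical events below are hypothetical, chosen only to illustrate the idea: critical events are nodes, each regime is a directed branch from the event that starts it to the event that ends it, and a scenario is a chain of branches in which each begins where the previous one ended.

```python
# Hypothetical flow chart: (regime, starting event, ending event).
BRANCHES = {
    ("ground roll", "brakes released", "liftoff"),
    ("normal flight", "liftoff", "stall onset"),
    ("normal flight", "liftoff", "touchdown"),
    ("stall", "stall onset", "recovery"),
    ("normal flight", "recovery", "stall onset"),
    ("normal flight", "recovery", "touchdown"),
    ("ground roll", "touchdown", "full stop"),
}


def is_scenario(path):
    """Return True if `path` is a non-empty chain of permitted, connected branches."""
    if not path:
        return False
    if any(branch not in BRANCHES for branch in path):
        return False
    # Each branch must start at the critical event that ended the previous one.
    return all(prev[2] == nxt[1] for prev, nxt in zip(path, path[1:]))


flight_with_stall = [
    ("ground roll", "brakes released", "liftoff"),
    ("normal flight", "liftoff", "stall onset"),
    ("stall", "stall onset", "recovery"),
    ("normal flight", "recovery", "touchdown"),
    ("ground roll", "touchdown", "full stop"),
]
```

Here `is_scenario(flight_with_stall)` holds, while a list that jumps from the takeoff roll directly to the landing roll is rejected because the branches do not connect. Which outgoing branch is taken at a node is not decided by the chart itself, in line with the remark above.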

These general schematic ideas apply even to simulations that are not 
engineering in character. An example would be a computerized war game 
whose purpose is to train staff officers to handle problems of manpower, 
equipment, petroleum supplies, and other requirements of a military
campaign. In the initial planning for such a simulation, a specific time history, or
scenario, is determined. During the game the computer must produce a 
running account of the above military essentials, taking into account the 
attrition due to the campaign itself and to enemy action and various factors 
such as terrain, and also taking into account the actions of the trainees. 
These elements appear in the influence block diagram, and the math model 
is based on them using statistics acquired from military experience. Notice 
that here the natural order is scenario, block diagram, and math model. 

A war game whose purpose is to test a plan or to develop skill in tactics 
will have a somewhat different development. The initial effort usually would 
be concerned with the influence block diagram. The opposing forces would 
be resolved into groups whose status can be quantitatively described and 
treated as a unit relative to combat and mobility. These units interact in 
various ways, such as by fire power, losses, terrain, map position, and logistics. 
These relations are relatively complex and are perhaps analyzed best initially 
in diagrammatic form, and the math model is based on this diagram. The 
limitations of the analysis result in a procedure in which values for manpower, 
fire power, vehicles in service, ammunition, and supplies are extrapolated for 
brief intervals of time. The math model must provide the rates for these 
extrapolations in terms of the variables. The simulation is essentially dynamic. 
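
The extrapolation step described here can be sketched as follows. The quantities tracked and, in particular, the rate laws are hypothetical placeholders; in an actual war game the math model would supply rates derived from military experience.

```python
def extrapolate(values, rates, dt):
    """Advance every quantity by (rate * dt) over one brief interval."""
    r = rates(values)
    return {name: values[name] + dt * r[name] for name in values}


def hypothetical_rates(values):
    """Illustrative rates only, expressed in terms of the current variables."""
    return {
        "manpower": -0.01 * values["manpower"],    # assumed 1% attrition per day
        "ammunition": -2.0 * values["manpower"],   # assumed 2 rounds per man per day
        "supplies": -3.0 * values["manpower"],     # assumed 3 ration units per man per day
    }


status = {"manpower": 1000.0, "ammunition": 50000.0, "supplies": 30000.0}
next_status = extrapolate(status, hypothetical_rates, dt=0.5)  # half a day later
```

Repeating the call with the updated status gives the running account interval by interval; because the rates depend on the current values, the simulation is dynamic in the sense used above.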


2.6. Operational Flight Trainer 

To illustrate the ideas associated with technical simulations, in
particular, block diagram, math model, flow chart, and scenario, we consider an
operational flight trainer (OFT). Such a device is used either to train novices
or to familiarize experienced pilots with a new aircraft. Our discussion will
be simplified to the greatest extent consistent with our purpose.

The very interesting technical history and characteristics of OFTs are
described in the U.S. Navy Commemorative Technical Volume.(2) An
introduction to the aerodynamic aspects and equations of motion is given by
Connolly.(3)

Figure 2.1. Operational flight trainer.

We suppose that the purpose of the trainer is to provide familiarity with and
understanding of powered flight and to develop skill patterns for novice
pilots. The trainer will consist of two portions. One part (see Figure 2.1) is a
cockpit with instruments and controls and a display to provide cues needed
for takeoff and landing. The other portion of the trainer involves data processors
that contain a mathematical simulation of the aircraft flight determined by
the cockpit controls. This simulation governs the cockpit instruments and
provides input for the activation of the display. The servo systems that
position the control surfaces on the aircraft are also simulated.

Our immediate concern is with the flight simulation, and this is based 
on Newtonian mechanics. However, the detailed analysis must be based on 
the flight phase, since that determines the forces and torques. The normal 
sequence of flight phases is (1) takeoff one, (2) takeoff two, (3) normal flight, 
(4) glide, (5) landing one, (6) landing two. However, of these phases, (1), 
(2), (5), or (6) could end in disaster, for example, if a wing scraped the ground. 
On the other hand, flight or glide could end in a stall, spin, or tumble. But in
both the disaster case and the cases of abnormal flight, the simulation must
produce appropriate output. The trainee may also be capable of producing a
return to normal flight from the abnormal case. We have, therefore, a finite 
set of possible flight phases and the dynamic analysis is determined by the 
phase. The simulation in each phase ultimately leads to a change of phase. 
We can, of course, introduce a discrete variable to specify the flight phase. 

Thus the simulation of flight box in the above can be resolved by
introducing a flight phase box and a number of subsidiary boxes (Figure 2.2). The
significance of the double lines is that while the flight phase determines the
analysis to be used, i.e., the subsidiary boxes, the development in time in each
case can lead to a change of flight phase. We will discuss this in more detail
later.

Figure 2.2. Resolution of the flight phase.

We will concentrate our attention on the subsidiary box “normal flight.” 
The two outputs of the simulation of flight are the instruments and the display 
activation. The landing display is determined by the relative position of the 
aircraft and the landing field. There are two classes of instruments, flight 
instruments and engine instruments. Basic flight instruments are the
altimeter, turn-and-bank indicator, artificial horizon, airspeed indicator,
rate-of-climb indicator, gyrocompass, and magnetic compass. These
instruments indicate to the pilot the aspect of the plane and its altitude. The student
is trained to fly the plane relative to these instruments, which are more 
sensitive than direct observation. Examples of engine instruments are the 
tachometer, fuel pressure gauge, oil pressure gauge, carburetor air intake 
pressure gauge, manifold pressure gauge, torque meters or output
horsepower, and temperature gauges for air intake, oil and fuel, and the cylinder
heads. The precise operating status of the engine is determined by adjusting
fuel intake and air intake and is indicated by the tachometer, output
horsepower, manifold pressure, temperature, etc. From the point of view of overall
flight, however, the two critical outputs are the thrust of the propellers and 
the consumption of fuel. 

In normal flight the aircraft is subject to gravity, the thrust of the engine, 
and the action of the air. One considers the airplane as a solid body consisting 
of the fuselage, fixed wing surfaces (the empennage), and control surfaces. For 
the purpose of analysis, the latter are considered to have fixed angular
relations relative to the fixed elements at any instant of time, although this
relative position is actually controlled by the pilot. According to Newtonian
dynamics, the forces acting on the aircraft produce accelerations that can be 
integrated to yield the motion and position and aspect of the aircraft. Thus, 
we have a rough tentative diagram for the flight box (Figure 2.3). 

We must specify the position and motion of the aircraft by means of
certain variables. There are various ways in which the motion of a solid body
in space can be described. Our method contains some redundancy but has a
compensating flexibility, which is useful. We begin by choosing a coordinate
system (the “fixed” system) that is fixed relative to the earth. We will assume
that this fixed system is an “inertial frame,” that is, a coordinate system in
which Newton's law, force = mass × acceleration, is valid, although this is not
strictly true because of the rotation of the earth. We also choose a coordinate
axis system fixed in the aircraft. It is customary to choose the first, or x axis,
along a longitudinal axis of the aircraft with positive direction forward. We
choose the second, or y axis, to be horizontal in normal flight, and the third,
or z axis, to be vertical. This coordinate system is called the “body” system.

Let $\mathbf{u} = (u^1, u^2, u^3)$ be the displacement vector of the origin of the body
system, expressed by components, in the fixed system. During the flight each
component, $u^1$, $u^2$, and $u^3$, will be a function of the time. Let
$\mathbf{i}_1 = (i_1^1, i_1^2, i_1^3)$ be the unit vector in the positive direction of the
first axis of the body system, expressed by components in the fixed system, and
let $\mathbf{i}_2 = (i_2^1, i_2^2, i_2^3)$ and $\mathbf{i}_3 = (i_3^1, i_3^2, i_3^3)$ be the
corresponding unit vectors along the second and third body axes. During flight,
the components $i_r^s$, $r, s = 1, 2, 3$, are also functions of the time.

The twelve quantities $u^s$, $i_r^s$ are adequate to describe the position of the
aircraft: for if $A$ is any point in the aircraft, it will have three constant
coordinates $(x^1, x^2, x^3)$ in the body system, and if $(s^1, s^2, s^3)$ is the
corresponding set of coordinates in the fixed system, one has

$$(s^1, s^2, s^3) = (u^1, u^2, u^3) + x^1 \mathbf{i}_1 + x^2 \mathbf{i}_2 + x^3 \mathbf{i}_3$$

or $\mathbf{s} = \mathbf{u} + \mathbf{x}\mathbf{i}$ when one considers $\mathbf{s}$, $\mathbf{u}$, and $\mathbf{x}$ as one-rowed
matrices and $\mathbf{i}$ is the $3 \times 3$ matrix with rows corresponding to the
$\mathbf{i}_r$ vectors. We will now drop the use of boldface for vectors and matrices
and write $s = u + xi$. Since $x$ is a constant, this equation describes the
position of $A$ when the components of $u$ and $i$ are given as functions of the
time.

Figure 2.3. Influence block diagram.
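
As a small numerical illustration of the relation s = u + xi, the sketch below converts body coordinates to fixed coordinates. The function name and the sample values are hypothetical; vectors are treated as rows and the rows of `i` are the body-axis unit vectors, following the convention of the text.

```python
def body_to_fixed(u, i, x):
    """Return s = u + x i, with u and x as row vectors and i as a 3x3 matrix
    whose rows are the body-axis unit vectors expressed in the fixed system."""
    return tuple(u[s] + sum(x[r] * i[r][s] for r in range(3)) for s in range(3))


# Hypothetical attitude: body axes turned a quarter turn about the third fixed axis.
i = ((0.0, 1.0, 0.0),     # first body axis points along the second fixed axis
     (-1.0, 0.0, 0.0),    # second body axis points along the negative first fixed axis
     (0.0, 0.0, 1.0))     # third axes coincide
u = (10.0, 20.0, 5.0)     # origin of the body system in the fixed system
x = (2.0, 0.0, 1.0)       # constant body coordinates of the point A

s = body_to_fixed(u, i, x)
```

With these values the point lies two units along the first body axis and one unit along the third, so `s` is the origin displaced by 2 in the second fixed direction and by 1 in the third.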

Since $x$ is a constant, we have

$$\frac{ds}{dt} = \frac{du}{dt} + x\,\frac{di}{dt}.$$

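We must express $di/dt$. Let the prime denote the transpose. Since $i$ is an
orthogonal matrix, we have $ii' = 1$ and hence

$$0 = \frac{di}{dt}\,i' + i\,\frac{di'}{dt} = \frac{di}{dt}\,i' + \left(\frac{di}{dt}\,i'\right)'$$

by the well-known properties of the transpose. Let $\Omega = (di/dt)\,i'$. This equation
can be written $0 = \Omega + \Omega'$, or $\Omega$ is antisymmetric. Hence, $\Omega$ can be written

$$\Omega = \begin{pmatrix} 0 & \omega_{12} & -\omega_{31} \\ -\omega_{12} & 0 & \omega_{23} \\ \omega_{31} & -\omega_{23} & 0 \end{pmatrix}.$$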

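We also have $di/dt = \Omega i$. If we let $\omega$ denote the vector $(\omega^1, \omega^2, \omega^3)$ with
$\omega^1 = \omega_{23}$, $\omega^2 = \omega_{31}$, $\omega^3 = \omega_{12}$, then

$$x\Omega = (x^1, x^2, x^3) \begin{pmatrix} 0 & \omega^3 & -\omega^2 \\ -\omega^3 & 0 & \omega^1 \\ \omega^2 & -\omega^1 & 0 \end{pmatrix}
= (\omega^2 x^3 - \omega^3 x^2,\; \omega^3 x^1 - \omega^1 x^3,\; \omega^1 x^2 - \omega^2 x^1) = \omega \times x.$$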
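
The identity between the row-vector product with the antisymmetric matrix and the cross product can be checked numerically. The sketch below is an illustration only; the helper names are hypothetical, vectors are treated as rows as in the text, and the test values are arbitrary.

```python
def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def omega_matrix(w):
    """Antisymmetric matrix built from w = (w1, w2, w3), laid out as in the text."""
    w1, w2, w3 = w
    return ((0.0, w3, -w2),
            (-w3, 0.0, w1),
            (w2, -w1, 0.0))


def row_times_matrix(x, m):
    """Row vector x times the 3x3 matrix m."""
    return tuple(sum(x[k] * m[k][j] for k in range(3)) for j in range(3))


w = (1.0, 2.0, 3.0)
x = (4.0, 5.0, 6.0)
left = row_times_matrix(x, omega_matrix(w))   # x times Omega
right = cross(w, x)                           # w cross x
```

Both `left` and `right` evaluate to the same triple, which is the content of the identity; changing the sign pattern in `omega_matrix` breaks the agreement, which makes this a convenient check on the layout of the matrix.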


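We have

$$\frac{ds}{dt} = \frac{du}{dt} + x\,\frac{di}{dt} = \frac{du}{dt} + x\Omega i = \frac{du}{dt} + (x\Omega)\,i = \frac{du}{dt} + (\omega \times x)\,i,$$

and, for the acceleration,

$$\frac{d^2 s}{dt^2} = \frac{d^2 u}{dt^2} + \left(\frac{d\omega}{dt} \times x\right) i + [\omega \times (\omega \times x)]\,i.$$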

The velocity of any point can be expressed in terms of $du/dt$ and $\omega$. We can
also express the velocity as a vector in the body system by means of $v$, defined
by $du/dt = vi$ and $ds/dt = (v + \omega \times x)i$. Hence the position and motion of the
aircraft can be expressed in terms of $u$, $i$, $v$, and $\omega$.
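
A short sketch of the velocity formula follows. It computes the fixed-system velocity of a body point from the body-frame velocity of the origin, the angular velocity vector, and the attitude matrix. The function names and the sample values are hypothetical; the row-vector convention of the text is kept.

```python
def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def body_point_velocity(v, w, x, i):
    """Fixed-system velocity (v + w x x) i of the point with body coordinates x."""
    rel = cross(w, x)
    y = tuple(v[k] + rel[k] for k in range(3))           # velocity in the body system
    return tuple(sum(y[r] * i[r][s] for r in range(3)) for s in range(3))


# Hypothetical values: body axes turned a quarter turn about the third fixed axis.
i = ((0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
v = (1.0, 0.0, 0.0)      # origin moves forward along the first body axis
w = (0.0, 0.0, 2.0)      # rotation about the third axis
x = (1.0, 0.0, 0.0)      # point one unit ahead of the origin

velocity = body_point_velocity(v, w, x, i)
```

In the body system the point moves with velocity (1, 2, 0): the forward speed of the origin plus the sideways contribution of the rotation. Multiplying by `i` expresses that vector in the fixed system.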

The Newtonian dynamics of a rigid body is usually based on the
assumption that the body consists of a large number, $n$, of particles small enough so
that each can be considered located at a point. Let particle $\alpha$ have position
vector $s_\alpha = (s_\alpha^1, s_\alpha^2, s_\alpha^3)$ in the fixed (inertial) system, mass $m_\alpha$, and be
subject to a force $F_\alpha$, which is the resultant of all the forces acting on the
particle. Thus, $F_\alpha = m_\alpha\, d^2 s_\alpha/dt^2$.

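If $s_0$ is the position of the center of gravity, then $\sum_\alpha m_\alpha s_\alpha = M s_0$, where
$M = \sum_\alpha m_\alpha$ is the total mass and $\sum_\alpha m_\alpha (s_\alpha - s_0) = 0$. If we sum over all
particles, we obtain

$$F_R = \sum_\alpha F_\alpha = \sum_\alpha m_\alpha \frac{d^2 s_\alpha}{dt^2} = M \frac{d^2 s_0}{dt^2}.$$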




In summing the forces, the forces between particles cancel, so that $F_R$ is the
resultant of the external forces applied to the set of $n$ particles, that is, to the
body. This equation can also be written

$$F_R = M\left[\frac{dv}{dt} + \frac{d\omega}{dt} \times x_0 + \omega \times v + \omega \times (\omega \times x_0)\right] i,$$

where $x_0$ is the center-of-gravity displacement vector in the body frame. If
$G_R$ is the vector expression for $F_R$ in the body frame, $F_R = G_R i$, and we have

$$G_R = M\left[\frac{dv}{dt} + \frac{d\omega}{dt} \times x_0 + \omega \times v + \omega \times (\omega \times x_0)\right].$$


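We also have

$$F_\alpha - \frac{m_\alpha}{M} F_R = m_\alpha \left(\frac{d^2 s_\alpha}{dt^2} - \frac{d^2 s_0}{dt^2}\right)$$

and

$$\left(F_\alpha - \frac{m_\alpha}{M} F_R\right) \times (s_\alpha - s_0) = m_\alpha \left(\frac{d^2 s_\alpha}{dt^2} - \frac{d^2 s_0}{dt^2}\right) \times (s_\alpha - s_0).$$

We sum over the particles and obtain, since $\sum_\alpha m_\alpha (s_\alpha - s_0) = 0$,

$$\sum_\alpha F_\alpha \times (s_\alpha - s_0) = \frac{d}{dt} \sum_\alpha m_\alpha \left(\frac{ds_\alpha}{dt} - \frac{ds_0}{dt}\right) \times (s_\alpha - s_0).$$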

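Here again, when summation occurs, the torques due to internal forces cancel.
Let $G_\alpha$ be the expression of $F_\alpha$ in the body axes; i.e., $F_\alpha = G_\alpha i$, and recall
$s = u + xi$, $ds/dt = du/dt + (\omega \times x)i$. We obtain

$$\sum_\alpha (G_\alpha i) \times [(x_\alpha - x_0)\,i] = \frac{d}{dt} \sum_\alpha m_\alpha \{[\omega \times (x_\alpha - x_0)]\,i\} \times [(x_\alpha - x_0)\,i],$$

and since the cross product is invariant under an orthogonal transformation,

$$\frac{d}{dt}\left\{\left(\sum_\alpha m_\alpha [\omega \times (x_\alpha - x_0)] \times (x_\alpha - x_0)\right) i\right\} = \left[\sum_\alpha G_\alpha \times (x_\alpha - x_0)\right] i$$

$$= \left(\sum_\alpha G_\alpha \times x_\alpha - G_R \times x_0\right) i$$

$$= (T^0 - G_R \times x_0)\,i.$$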


Here $T^0$ is the resultant of the external torques expressed in the body system.
The vector

$$P = \sum_\alpha m_\alpha [\omega \times (x_\alpha - x_0)] \times (x_\alpha - x_0) = \sum_\alpha m_\alpha (\omega \times x_\alpha) \times x_\alpha - M(\omega \times x_0) \times x_0$$

is termed the angular momentum, and in the limit associated with taking the
number of particles indefinitely large, one has

$$P = \iiint \delta\,(\omega \times x) \times x \, dV - M(\omega \times x_0) \times x_0,$$


where $\delta = \delta(x^1, x^2, x^3)$ is the density and the integration is over the volume
of the aircraft. $P$ is a vector that depends on the vector $\omega$ by a transformation
given by a matrix $J$ called the moment of inertia. The formula
$(a \times b) \times c = (a \cdot c)b - (b \cdot c)a$ permits one to express $J$ in terms of the
matrices $J_0$ and $X_0$ and a scalar $j_0$ defined as follows:

$$J_0 = \left(\iiint \delta\, x^r x^s \, dV\right),$$

where

$$X_0 = (x_0^r x_0^s)$$

and

$$j_0 = \iiint \delta\,[(x^1)^2 + (x^2)^2 + (x^3)^2]\, dV - M[(x_0^1)^2 + (x_0^2)^2 + (x_0^3)^2].$$

Then $J = J_0 - M X_0 - j_0$ (the scalar $j_0$ being understood as $j_0$ times the unit
matrix) and $P = \omega J$. Our previous result becomes

$$\frac{d}{dt}(\omega J i) = (T^0 - G_R \times x_0)\,i$$

or

$$\left[\frac{d\omega}{dt}\,J + \omega \times (\omega J)\right] i = (T^0 - G_R \times x_0)\,i.$$

This yields a relation to specify $d\omega/dt$.

Thus Newtonian dynamics yields a set of equations to specify $d\omega/dt$,
$dv/dt$, $du/dt$, and $di/dt$, i.e.,

$$\frac{d\omega}{dt}\,J + \omega \times (\omega J) = T^0 - G_R \times x_0,$$

$$G_R = M\left[\frac{dv}{dt} + \frac{d\omega}{dt} \times x_0 + \omega \times v + \omega \times (\omega \times x_0)\right],$$

$$\frac{du}{dt} = vi,$$

$$\frac{di}{dt} = \Omega i,$$

if one knows the values of $T^0$, $G_R$, $x_0$, and of course, $\omega$, $v$, $u$, and $i$. This is
then a set of differential equations that gives the time rate of change of the
eighteen components of $\omega$, $v$, $u$, and $i$ in terms of themselves.
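
The matrix J that enters the first of these equations can be checked numerically for a small set of point masses, using sums in place of the volume integrals. The sketch below is an illustration only: the masses, positions, and angular velocity are hypothetical, and the function names are not from the text. Note that with the order of the cross products used in the text, this J is the negative of the inertia tensor as it is more commonly defined; the sketch follows the text's convention.

```python
def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def inertia_matrix(masses, points):
    """Return (J, x0) with J = J0 - M*X0 - j0*(unit matrix) for point masses."""
    M = sum(masses)
    x0 = tuple(sum(m * p[k] for m, p in zip(masses, points)) / M for k in range(3))
    j0 = (sum(m * sum(c * c for c in p) for m, p in zip(masses, points))
          - M * sum(c * c for c in x0))
    J = [[sum(m * p[r] * p[s] for m, p in zip(masses, points))   # entry of J0
          - M * x0[r] * x0[s]                                    # entry of M*X0
          - (j0 if r == s else 0.0)                              # j0 on the diagonal
          for s in range(3)] for r in range(3)]
    return J, x0


def angular_momentum_direct(masses, points, x0, w):
    """P as the particle sum of m [w x (x - x0)] x (x - x0)."""
    total = [0.0, 0.0, 0.0]
    for m, p in zip(masses, points):
        d = tuple(p[k] - x0[k] for k in range(3))
        term = cross(cross(w, d), d)
        for k in range(3):
            total[k] += m * term[k]
    return tuple(total)


def row_times_matrix(w, J):
    """Row vector w times the matrix J."""
    return tuple(sum(w[r] * J[r][s] for r in range(3)) for s in range(3))


# Hypothetical data: three point masses and an arbitrary angular velocity.
masses = [1.0, 2.0, 3.0]
points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
w = (0.3, -1.2, 2.0)

J, x0 = inertia_matrix(masses, points)
P_direct = angular_momentum_direct(masses, points, x0, w)
P_matrix = row_times_matrix(w, J)
```

The two results agree to rounding error, which confirms that the combination of J0, X0, and j0 reproduces the particle sum.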

There are numerical procedures that, when the values of $\omega$, $v$, $u$, and $i$
are given at a time $t_0$, yield the values of these same quantities at times
$t_0 + h$, $t_0 + 2h$, $t_0 + 3h$, ... with high accuracy if a sufficiently small value of $h$
is used. This is termed step-by-step integration with step $h$.
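
To apply such a procedure, the first two equations of the system must be solved for the derivatives. A sketch of the rearrangement, assuming that the matrix $J$ is invertible and that $T^0$, $G_R$, and $x_0$ are known at the current step (the row-vector convention of the text is kept, so $J^{-1}$ multiplies on the right):

```latex
\begin{aligned}
\frac{d\omega}{dt} &= \bigl[\,T^0 - G_R \times x_0 - \omega \times (\omega J)\,\bigr]\, J^{-1}, \\
\frac{dv}{dt} &= \frac{1}{M}\,G_R - \frac{d\omega}{dt} \times x_0 - \omega \times v - \omega \times (\omega \times x_0), \\
\frac{du}{dt} &= v\,i, \qquad \frac{di}{dt} = \Omega\,i .
\end{aligned}
```

The first line must be evaluated before the second, since the expression for $dv/dt$ contains $d\omega/dt$. In this form the right-hand sides involve only the current values of $\omega$, $v$, $u$, $i$ and the quantities supplied by the other boxes.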

We can now complete our discussion of two of the boxes in the block
diagram of Figure 2.3 by stating the associated variables and how they are to
be determined mathematically as functions of the time. The variables for the
Newtonian dynamics are the derivatives $d\omega/dt$, $dv/dt$, $du/dt$, $di/dt$, and we
have equations for these in terms of $T^0$, $G_R$, and $x_0$, which presumably will
come from other boxes. The variables for the position and motion box are
$u$, $i$, $v$, and $\omega$, and these are to be obtained by numerical integration of the
differential equation system.
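
A minimal sketch of such a numerical integration follows, restricted to the two kinematic equations (the rate of change of u and of i) with v and the angular velocity held constant. The text does not name a particular method; the forward Euler rule is used here only because it is the simplest step-by-step scheme, and all function names and test values are assumptions for illustration.

```python
import math


def omega_matrix(w):
    """Antisymmetric matrix built from w = (w1, w2, w3), laid out as in the text."""
    w1, w2, w3 = w
    return [[0.0, w3, -w2],
            [-w3, 0.0, w1],
            [w2, -w1, 0.0]]


def mat_mul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][s] for k in range(3)) for s in range(3)]
            for r in range(3)]


def integrate_position_and_axes(u0, i0, v, w, h, steps):
    """Forward-Euler steps for du/dt = v i and di/dt = Omega i (v, w constant)."""
    u = list(u0)
    i = [list(row) for row in i0]
    om = omega_matrix(w)
    for _ in range(steps):
        du = [sum(v[r] * i[r][s] for r in range(3)) for s in range(3)]
        di = mat_mul(om, i)
        u = [u[s] + h * du[s] for s in range(3)]
        i = [[i[r][s] + h * di[r][s] for s in range(3)] for r in range(3)]
    return u, i


def axes_error(i, angle):
    """Largest deviation from the exact axes after turning `angle` about the third axis."""
    c, s = math.cos(angle), math.sin(angle)
    exact = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
    return max(abs(i[r][k] - exact[r][k]) for r in range(3) for k in range(3))


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
START = (0.0, 0.0, 0.0)
V = (1.0, 0.0, 0.0)      # unit speed along the first body axis
W = (0.0, 0.0, 1.0)      # unit angular velocity about the third axis

# Integrate to t = 1 with a coarse and a fine step.
u_coarse, i_coarse = integrate_position_and_axes(START, IDENTITY, V, W, 0.01, 100)
u_fine, i_fine = integrate_position_and_axes(START, IDENTITY, V, W, 0.001, 1000)
```

Comparing `axes_error(i_coarse, 1.0)` with `axes_error(i_fine, 1.0)` shows the error shrinking as the step is reduced, in line with the remark that accuracy is obtained with a sufficiently small step. A production simulation would normally use a higher-order method and would also restore the orthogonality of the axes matrix, which the Euler rule lets drift.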

Three major types of forces act on the body, so there are three
contributions to $G_R$. One of these is gravity, which in the fixed system is a
vector straight down. In the body system it is a vector $g$ whose fixed-system
expression $gi$ is this vertical vector, of magnitude $M$ times the
gravitational acceleration. The contribution of gravity to $T^0 - G_R \times x_0$
cancels out and can be omitted from $T^0$ if it is omitted from $G_R \times x_0$.

The thrust due to the engine is proportional to the power output, and 
the computations needed to specify the latter in time are quite complex. But 
we can, for our purposes, assume that the power output is a function of the 
altitude, the fuel-feed setting, the air-intake setting, and the speed of the 
aircraft. We can consider the thrust as a vector in the form 


$$G_e = C(u^3, v, f, a)\,(1, 0, 0);$$

i.e., the propeller axis is parallel to the x axis of the body system. If this axis 
does not pass through the center of gravity, there will also be a torque around 
the y axis due to the engine thrust, $T_e$. The fuel consumption is also an output
from the engine box, and the position of the center of gravity is determined
by the amount of fuel available.

It remains to consider the aerodynamics. If we consider the atmosphere
as still and the airplane as moving through it with velocity $v$, the aerodynamic
forces presumably are the same as in the case in which the aircraft is in a fixed
position and subject to a wind opposite in direction to $v$. Thus, the
aerodynamics are determined by the vector $v$. In normal flight, $v$ is close in
direction to the positive first axis of the body system, and its position can be
specified as follows. Take a vector of the appropriate length along the positive
first axis. Rotate this vector by an amount $\alpha$ around the second axis in such a
way that a small positive $\alpha$ yields a negative third-axis component. This $\alpha$
rotation can also be considered as a rotation of the body-system axes. One
follows this by a rotation of an amount $\beta$ about the new position of the third
axis. This specifies the position of $v$ in terms of the angles $\alpha$ and $\beta$, which are
small in normal flight.
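
The two-rotation construction can be sketched numerically. The sense of the first rotation is fixed by the text (a small positive $\alpha$ gives a negative third-axis component). The sense of the second rotation is not stated, so a right-handed rotation about the new third axis is assumed here. The helper names `rotate_about_axis` and `velocity_in_body_axes` are illustrative.

```python
import math

def rotate_about_axis(vec, axis, angle):
    """Rodrigues' formula: right-handed rotation of vec about the unit vector axis."""
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(a * v for a, v in zip(axis, vec))
    cross = (axis[1] * vec[2] - axis[2] * vec[1],
             axis[2] * vec[0] - axis[0] * vec[2],
             axis[0] * vec[1] - axis[1] * vec[0])
    return tuple(v * c + x * s + a * dot * (1 - c)
                 for v, x, a in zip(vec, cross, axis))

def velocity_in_body_axes(speed, alpha, beta):
    """Body-axis components of v for given speed and angles (radians)."""
    second_axis = (0.0, 1.0, 0.0)
    third_axis = (0.0, 0.0, 1.0)
    v = (speed, 0.0, 0.0)                      # start along the positive first axis
    v = rotate_about_axis(v, second_axis, alpha)
    new_third = rotate_about_axis(third_axis, second_axis, alpha)
    return rotate_about_axis(v, new_third, beta)  # assumed right-handed sense for beta

v = velocity_in_body_axes(100.0, math.radians(5.0), math.radians(2.0))
```

Under these assumptions the components work out to $|v|(\cos\alpha\cos\beta,\ \sin\beta,\ -\sin\alpha\cos\beta)$, so both angles being small keeps $v$ close to the first axis.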

The aerodynamic forces and torques are determined empirically in a 
wind tunnel. In this device, the “wind” is fixed in direction and the
instrumentation is also fixed. A model of the aircraft with the first body axis directed
positively into the wind and the second body axis horizontal determines a
system of coordinates called the “wind system.” Measurements are performed
corresponding to various values of $\alpha$ and $\beta$ by rotating the model inversely
to the above procedure for locating the vector $v$. However, the measurements
are still in the wind axis system and to yield vectors in the body axis system
one must perform a rotation with matrix $h = h(\alpha, \beta)$. Thus if $H$ is a vector in the
wind axis system and $G$ the corresponding vector in the body axis system,
one has $Hh = G$.

The measurements yield approximate expressions for the aerodynamic 
forces and torques in a number of variables. One needed quantity is the 
density of air, $\rho$. Within the practical limitations of the current problem, $\rho$
can be considered to be a function of the altitude, $u^3$. The density of air
yields the speed of sound and Mach number, Ma, which is the ratio of $|v|$ to the
speed of sound. The aerodynamic forces are also dependent on the values of
the angles of the control surfaces, $\delta_R$, $\delta_B$, $\delta_E$, $\delta_A$, for rudder, dive brakes,
elevators, and ailerons, respectively, and on the slot position variable $\delta_F$.
Let $S$ denote the wing area, $b$ the wing span, and $c$ the mean aerodynamic
chord.

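The three components of the aerodynamic force, $H_a^1$, $H_a^2$, $H_a^3$, are
represented by formulas of the following type:

$$H_a^1 = \tfrac{1}{2}\rho v^2 S\,[C_x(\alpha, \mathrm{Ma}) + C_x(\beta) + C_{xF}\delta_F + C_{xB}\delta_B]$$

$$H_a^2 = \tfrac{1}{2}\rho v^2 S\,(C_{y\beta}\beta + C_{yR}\delta_R)$$

$$H_a^3 = \tfrac{1}{2}\rho v^2 S\,[C_z(\alpha, \mathrm{Ma}) + C_{zE}\delta_E + C_{zF}\delta_F].$$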

The functions $C_z(\alpha, \mathrm{Ma})$ and $C_x(\alpha, \mathrm{Ma})$ are the empirically determined lift
and drag functions, respectively. For Ma fixed and relatively small values of
$\alpha$, $C_z(\alpha, \mathrm{Ma})$ is an increasing function of $\alpha$ but reaches a maximum and
abruptly drops, corresponding to the condition of stall.

The aerodynamic torques are quite similar:

Sl =WbS(C,*d A + C„,p + C, R S*)+ipvSb 2 (C IP \P\ + C lw sw 3 ) 

Si=\pv 2 cS[C m (ct, Ma) + C mE <5 E + C^f] + \pvc 2 S(C mi y. + C mM ..w 2 ) 

Sl = \pv 2 bS(C„^ + C„ R 6n + C^d^+frvtfSlC^w 3 + C„ P |P|). 

From these, the corresponding quantities in the body system are obtained 
by means of the matrix $h$; i.e., $H_a h = G_a$, $S_a h = T_a$.

Thus, the resultant force in the body system is given by $G_R = \mathbf{g} + G_e + G_a$,
and if $\mathbf{g}$ is omitted in $G_R \times x_0$, one can use $T_R = T_a + T_e$. The lift term
corresponding to $C_z(\alpha, \mathrm{Ma})$ must essentially balance $\mathbf{g}$. However, associated
with the lift, there is a drag term $C_x(\alpha, \mathrm{Ma})$ and a moment around the y axis.
The engine thrust must balance the drag. To counteract the moment around
the wing, the stabilizer or horizontal tail surface produces a counter torque.
The resultant of these two torques is represented by the $C_m(\alpha, \mathrm{Ma})$ term in the
$S_a^2$ expression. Thus, if lift equals gravity, engine thrust equals drag, and
the turning moments cancel, one can have straight horizontal flight at 
constant speed with the control surfaces in neutral position. A rotation 
around the first body axis is called roll. The ailerons are used to counteract 
unwanted roll and to permit roll adjustments. Turns in a horizontal plane 
involve the rudder to a certain extent, but mostly a roll adjustment so that 
the lift has a horizontal component in the direction of turn. To ascend, extra 
lift in the $H_a^3$ component is provided by the elevators and flaps or slots. These
also produce torques around the second body axis (y axis). A rotation around 
the y axis is called pitch and one around the z axis is called yaw. 

The purpose of the various coordinate systems is now clear. The inertial 
system is required to establish the position of the plane and to apply Newton's 
laws. In the body system, the moment-of-inertia matrix is constant, and 
integration is performed in this system. The aerodynamics is given in the 
wind system. 

We are now able to complete the normal flight block diagram and, by 
implication, the math model (Figure 2.4). This is to be considered as part of 
Figure 2.2. In this block the motion and position develop according to the 
system of differential equations given above. We have other systems of 
differential equations for the phases in the usual flight sequence, for example,
the first takeoff phase, in which all the wheels are on the ground, and the
second takeoff phase, in which the nose wheel is off the ground. One can also have
a disaster if the wing scrapes the ground in takeoff. This must also be
considered a flight phase and simulated.

Figure 2.4. Normal flight.


Thus, the flight of the aircraft can be described by various regimes of 
development, governed in most cases by systems of differential equations. A 
change of regimes corresponds to a change of state variables. Such a change 
will occur when a variable or combination of variables satisfies a relation or 
inequality. For example, normal flight can end in a stall if $\alpha$ becomes too
large, or the second takeoff phase becomes normal flight when $u^3$ reaches a certain value.
Thus, each regime proceeds until such a criterion is satisfied. 
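
The idea that each regime carries its own termination criteria can be sketched as a small table of tests. The regime names, the stall angle, and the altitude threshold below are hypothetical values chosen only for illustration; the text gives no numerical criteria.

```python
# Hypothetical thresholds, for illustration only.
STALL_ALPHA = 0.30          # radians; assumed angle beyond which stall occurs
CLIMB_OUT_ALTITUDE = 15.0   # assumed altitude that ends the second takeoff phase

# For each regime: a list of (criterion on the state, regime that follows).
TRANSITIONS = {
    "takeoff_two": [(lambda s: s["altitude"] >= CLIMB_OUT_ALTITUDE, "normal_flight")],
    "normal_flight": [(lambda s: s["alpha"] > STALL_ALPHA, "stall")],
    "stall": [],
}

def next_regime(regime, state):
    """Return the regime in force after testing the current regime's criteria."""
    for criterion, target in TRANSITIONS[regime]:
        if criterion(state):
            return target
    return regime   # no criterion satisfied: the regime continues

climbing = next_regime("takeoff_two", {"altitude": 20.0, "alpha": 0.05})
cruising = next_regime("normal_flight", {"altitude": 900.0, "alpha": 0.05})
stalled = next_regime("normal_flight", {"altitude": 900.0, "alpha": 0.35})
```

Keeping the criteria in a table rather than scattered through the code makes it easier to add a regime without disturbing the others.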

The controls in an aircraft cockpit operate power systems that position 
the control surfaces. These power systems function against aerodynamic 
forces depending on the plane's velocity, altitude, angular motion, and trim. 
Thus, the angular positions of the control surfaces are governed by
differential equations, which, in general, must be integrated numerically.

The display system is concerned with landing and needs only to
reproduce effective cues such as the edges of the landing field. One may wish,
however, to add background details. In general, the information to be 
displayed corresponds to points or lines fixed in the inertial coordinate 
system. One possibility for the display is to assume that it corresponds to a 
plane fixed relative to the body axis system, perpendicular to the x, or first, 
axis and, appropriately, in front of the aircraft. For each object or point to 
be displayed, one must determine where the line of sight from the pilot to the 
object or point intersects the display plane. This is an interesting
mathematical relation to work out. One would normally expect all such computations
to be executed in a display computer that receives u and i from the flight 
computer. 
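
One way to work out the line-of-sight relation is sketched below. It rests on several assumptions not fixed by the text: vectors are rows, a body-axis vector times the matrix $i$ gives the corresponding inertial vector, $i$ is orthogonal, the pilot's eye is at the body-system origin, and the display plane is the plane where the first body coordinate equals a distance `d`. The function names are illustrative.

```python
def to_body_axes(point, u, i):
    """Express an inertial point, relative to the aircraft at u, in body axes.

    With row vectors and (body vector) * i = (inertial vector), the inverse
    map for an orthogonal i is multiplication by the transpose of i.
    """
    r = [p - q for p, q in zip(point, u)]
    return [sum(r[k] * i[j][k] for k in range(3)) for j in range(3)]

def project_to_display(point, u, i, d):
    """Return the (second, third) body coordinates where the line of sight
    meets the display plane, or None if the point is not ahead of the pilot."""
    x, y, z = to_body_axes(point, u, i)
    if x <= 0.0:
        return None
    scale = d / x          # similar triangles along the line of sight
    return (y * scale, z * scale)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
corner = project_to_display((1000.0, 50.0, 0.0), (0.0, 0.0, 100.0), identity, 1.0)
behind = project_to_display((-10.0, 0.0, 0.0), (0.0, 0.0, 0.0), identity, 1.0)
```

In this sketch only `u` and `i` are needed from the flight computation, which matches the division of labor between flight computer and display computer described above.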


2.7. Block Diagrams 

Originally, the term “block diagram” referred to the description of the
relation between the various pieces of equipment of a complete device. Each 
component is indicated by a box, and the interconnecting lines represent 
either a mechanical connection such as a rotating shaft or an electrical 
connection. When a simulation is embodied in a complete device, as, for 
example, in a flight trainer, the term is still used in this way. If one of the 
ultimate aims of an engineering project is the development of a physical 
system or device, an essential output is a block diagram in this sense. We will 
refer to this as an equipment block diagram. 

The concept of an “influence block diagram” for analyzing the situation
to be simulated arose in the use of analog equipment for simulation purposes.
In an analog simulator each mathematical operation is represented by a
special component, and each formula or computational procedure is
represented by equipment in the same rack or associated racks. Thus, if the
flight of an airplane is to be simulated, one would have groups of racks
corresponding to the computation of the aerodynamic forces on the plane
or the action of the engine or of the kinetic integrations and changes of
coordinate systems. For a discussion of analog computation, see Murray.(5)

Correspondingly, if one is planning to set up such a system, one would 
analyze the original situation into elements that could be represented by 
coherent blocks of computation in the analog sense. The relationship of these 
elements can then be described by a block diagram that is analogous to an 
equipment block diagram. The use of block diagrams in this way is an
effective method for carrying over past experiences with similar problems and
incorporating improvements in understanding. 

When digital data processing is used instead of analog processing, blocks 
of coding in the machine program replace combinations of analog equipment. 
In programming, it is desirable to retain this resolution of the program in
coherent blocks, i.e., the “modularity.” Thus, one obtains program block
diagrams, corresponding to the influence block diagram of the original
situation. This digital programming block diagram has connections
corresponding to transfers of data, and this is similar to the analog case. It is,
of course, different from the flow chart in digital programming, which
indicates sequential relations in the execution of the program in time. The
greater flexibility of digital programming permits wider-ranging simulations
so that the possibilities for influence block diagrams are greatly expanded
and more complex. However, practical considerations must limit the logical
possibilities for the influence block diagram to what the requirements demand, and
it is desirable to remain as close as possible to previous experience.

We have indicated the role of tradition in setting up the influence block 
diagram. However, the simulation must also satisfy certain objectives, and 
this shows the importance of a precise requirements document. A simulation 
should not be more complex than that required by the objectives. This is 
generally reinforced by limitations on available resources for the simulation, 
such as manpower, computation, and total project time (i.e., deadlines). 

For these reasons, therefore, the influence block diagram will represent 
compromises in the amount of detail represented and the completeness of 
the quantitative description of the elements. It is in general illuminating to 
discover whether the local simulation tradition had an analog epoch. We 
notice two other uses of the term “block diagram,” i.e., equipment block 
diagram and program block diagram. 

The format of block diagram, math model, flow chart, and scenarios 
appears quite abstract. But actually, this represents an engineering tradition 
that originated in modeling in the literal sense. Patent procedures at one time 
were dependent on “working models,” and models are still used for ships, 
public works, and to a certain extent aircraft. The differential analyzer 
permitted one to replace an actual model by a configuration of computing 
components that solved the differential equations satisfied by the original. 
The math model for this situation was, of course, the system of differential 
equations. 

The idea of a mechanical differential analyzer was proposed by W. 
Thomson (Lord Kelvin) (6),(7) in 1875, but there were technical difficulties 
that were overcome only in the late 1920s. These devices came into use in the 
decade before World War II (see Bush (1) ). Electrical components for analog 
computation were developed in World War II by Bell Laboratories.
Electrical differential analyzers were much easier to program, and abrupt
changes such as those corresponding to a change of flight phase could be 
introduced by relays. 

The ability to solve differential equations permitted one to replace 
models for engineering purposes by models consisting of computing
components, which were more readily constructed. Furthermore, variations in
design and environment could be tested in an inexpensive fashion. The use 
of analog differential analyzers continued over a period of possibly thirty 
years and resulted in a general acceptance of analogy based on mathematical 
equivalence. By the middle sixties, digital data processing was capable of 
competing with analog equipment on a cost-effectiveness basis and the
logical flexibility of digital equipment permitted a considerable expansion
of the simulation field. 

The use of diagrams to describe relatively complex situations is not 
confined to the examples quoted above. An electric circuit diagram is 
essentially an “influence block diagram” with special rules for determining 
the math model. Complex mechanical devices, as well as electromechanical
systems, can be analyzed by the equivalent of a circuit diagram. One very
striking type of diagram is that which is used to indicate the sequence of 
chemical reactions that are associated with certain activities of living cells. 
This type of diagram has blocks for chemical reactions and connections for 
the chemicals themselves. Similar diagrams are applicable to industrial 
processes. With the current interest in ecology, diagrams associated with the 
biological environment are often used, and diagrams are obviously useful in 
economics and sociology. In general, within a given diagram all the blocks should
have the same kind of significance, as should all the connections.


2.8. Equipment 

An engineering simulation must be developed in phases that may overlap 
but that will permit the orderly use of resources. The first phase must involve 
the analysis of the original situation and yield the math model that governs 
the time history of the vicarious experience. Another phase must deal with 
equipment—both the data processing system and that which interacts with 
the subject. A third phase is programming, and a fourth phase consists in 
integrating the program and the diverse equipment into an effective device. 
The fifth phase is the operational one that will lead to the appropriate 
conclusions. 

In practice, one can divide the equipment into three systems, i.e., data 
processing; an input system, which produces data for the data processing; 
and an output system, which is controlled by data processing. In certain 
cases all three systems may be large. We will refer to the data processing 
system as the “computer” but, in addition to the central processing unit and 
storage hierarchy, it will usually contain peripheral equipment such as 
hard-copy printers, card readers, and card punches. 

The high speed of electronic circuits favors the dynamic type of
simulation in which a complete cycle of computation is repeated many times a
second. Many of the numbers are intermediate, and a smaller number
uniquely determines the time history. Similarly electronic displays such as
television pictures or cathode-ray-tube outputs generate complete pictures
or “frames” thirty times a second. Most of this information in the display
does not vary from frame to frame. Such displays normally have their own
store of information associated with them and with it a specialized computer 
to incorporate the new information as required. 

Thus, the calculation and the display storage are two disjoint banks of 
information with a special problem of intercommunication. The
communication process is itself demanding in logical capability, and the computer is
used for both computing and communication. The two processes are
interlaced in time.

In addition to the above displays, mechanical motion may also be desired 
as output. An electrical signal from the computer is used to produce this 
motion either directly, as in meters, or through amplification to attain the 
required power. When such an electrical signal has a continuous range 
(e.g., a voltage) it is called an “analog” signal. Normally a special “interface” 
with the computer is needed to produce the set of analog outputs required 
during each cycle of computation. In the interface the continuous signal is 
produced from a digital value obtained in the calculation. Audio effects may 
also require analog signals for control. 

The requirements on the input system are similar. Thus, the various 
controls on a flight trainer have continuous ranges, and the value in each 
case must be expressed in binary digital form and introduced as such into the 
computer. A simulation may require an enormous amount of input
information in total, but only a small fraction of it at any specific time. For example, a simulation
may require extensive map information for its entirety but only local map
information at any instant of time. The amount of information may require
that it be stored outside the direct-access memory of the computer. Under 
these circumstances one needs an additional computer to select and transmit 
the immediately needed information. This additional computer receives 
guidance information from the main computer. 

The second phase, then, consists of decisions on these three systems in
order to attain the objectives of the simulation. The objectives determine the 
output displays and analog input and output requirements. The precise 
choice of input and output equipment must also take into account
availability, reliability, and company policy. The decision on input and output
equipment determines precisely the input data available for the computation 
and the output data required from it. This information and the math model 
specify the amount of computing capability required for the data processing. 

The choice of the equipment can be represented by an equipment block 
diagram in which the connections indicate a transfer of data and can be 
associated with data communication rate. If in addition we associate with the 
block for each piece of equipment a data processing capability, we can
consider that the equipment block diagram is also a data flow diagram. This
constitutes a diagrammatic summary of the second phase.


2.9. The Time Pattern of the Simulation

The third stage in the development of the simulation is programming. 
This is an engineering effort of some magnitude and must be carefully 
phased. A possible phasing would be (a) requirements, (b) structuring and 
specifications, (c) coding, and (d) testing. 

The requirements of the program have two interacting aspects, one 
generated by the math model, the other by the data flow needs of the
equipment. These aspects are reconciled by determining (a) the time pattern for the
computation, (b) accuracy and resolution specifications for the computing 
procedures to realize the math model, and (c) testing exercises, including 
those to be used in assembling the equipment into a complete device. 

The time pattern for the computation must reflect the time development 
of the original situation, provide for the reception of input, and produce the 
activating output. Thus, one must have regimes of continuous change
terminated by critical events in which there are abrupt changes of state variables.

There are two ways in which this pattern is represented in the computer. 
If the development of the continuous parameters can be directly extrapolated, 
one can have a critical event simulation. In this type one has a succession of 
critical events, and at each such critical event the times of the predictable
future ones are computed. On the other hand, if the change in the continuous
variables is determined by a system of differential equations, the system must 
be solved by step-by-step methods, which advance the time by small fixed 
time increments. After each such advance one can test whether a change of the 
state variables will occur in the next time step and adjust the advance up to 
the critical point. In most cases a somewhat simpler procedure is acceptable 
in which one tests whether a critical event has occurred in the last interval, 
and if so, one assumes that the time of the critical event was at the end of the 
interval. 
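
The simpler procedure, in which a critical event is attributed to the end of the interval in which it occurred, can be sketched as follows. The constant-rate climb model, the threshold, and the names `advance` and `run_until_event` are assumptions made for illustration; a real simulation would advance the state with its own integration formulas.

```python
def advance(state, h):
    """One fixed step of a stand-in model: altitude changes at a constant rate."""
    return {"t": state["t"] + h,
            "altitude": state["altitude"] + h * state["climb_rate"],
            "climb_rate": state["climb_rate"]}

def run_until_event(state, h, criterion, max_steps=10000):
    """Advance by fixed increments h and test after each step whether the
    critical event occurred in the interval just completed. The event time
    is taken to be the end of that interval."""
    for _ in range(max_steps):
        state = advance(state, h)
        if criterion(state):
            return state, True
    return state, False

start = {"t": 0.0, "altitude": 0.0, "climb_rate": 3.0}
final, occurred = run_until_event(start, 0.05, lambda s: s["altitude"] >= 10.0)
timing_error = final["t"] - 10.0 / 3.0   # exact crossing time for this model
```

The timing error of this procedure is at most one time increment, which is why it is usually acceptable when the increment is small.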

In general, the analysis of the situation may be difficult to the point that 
given the present aspect, one can make predictions only for a short interval 
into the future. However, the fixed-time-increment procedure and automatic
data processing, which can be programmed to iterate such predictions,
yield numerical simulations under these circumstances. The other extreme
would be a case where the variation of the continuous variables
corresponding to any desired length of time could be computed, so that at each
critical event the time of occurrence of the next critical event could be computed
directly. For example, in a maintenance simulation, the time to next failure
is usually determined by a probabilistic method.

The programming pattern for a critical event simulation is based on a 
list of anticipated critical events arranged by time of occurrence. If the
simulation is not in real time, time is advanced by taking the next critical event in
time and computing those future events that are predictable at this point in
time and adding them to the basic list. In a real-time simulation, a computer clock
is tested at intervals until the next critical event in time is scheduled to occur.
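
A minimal sketch of the event-list pattern for a simulation that is not in real time is given below. It uses a maintenance example with exponentially distributed times to failure and a fixed repair time; both of these, along with the function name and parameters, are assumptions for illustration, since the text says only that the time to next failure is determined by a probabilistic method.

```python
import heapq
import random

def maintenance_simulation(horizon, mean_time_to_failure, repair_time, seed=0):
    """Return the log of (time, kind) critical events up to the horizon."""
    rng = random.Random(seed)
    events = []  # heap ordered by time of occurrence
    first = rng.expovariate(1.0 / mean_time_to_failure)
    heapq.heappush(events, (first, "failure"))
    log = []
    while events:
        time, kind = heapq.heappop(events)   # advance time to the next critical event
        if time > horizon:
            break
        log.append((time, kind))
        # Add the future events that are predictable at this point in time.
        if kind == "failure":
            heapq.heappush(events, (time + repair_time, "repaired"))
        else:
            gap = rng.expovariate(1.0 / mean_time_to_failure)
            heapq.heappush(events, (time + gap, "failure"))
    return log

log = maintenance_simulation(horizon=1000.0, mean_time_to_failure=100.0,
                             repair_time=8.0)
times = [t for t, _ in log]
```

Time here jumps directly from one event to the next; a real-time version would instead wait on a clock until the time at the head of the list.
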

The more general situation requires the “fixed time interval approach.”
The continuous variables are divided into sets; one set is determined by a 
system of differential equations, another set may advance kinetically, and a 
third set may consist of functions of variables in the first two sets. Thus, at 
each time increment the advanced values of each set are computed
successively and then the possibility of a critical event is determined. When a
critical event occurs, usually some computation is required to determine the 
new values of the state variables and to make the adjustments needed to 
initiate the new regime of time-increment advance. 

Input and output also determine characteristics of the time pattern of 
the computation. Consider the fixed time interval case. Input normally 
corresponds to the values of certain parameters, and these should remain 
constant during the computations associated with a given time value. 
Inconsistencies in the values of the parameters may well have objectionable 
mathematical and systemic effects. Thus, the input values must be buffered. 
When they are brought into the computer, they are placed in certain registers 
and introduced into the computation at the beginning of the computation 
for a fixed time value. These inputs can be introduced into the computer by 
means of an “interrupt” capability, which will transfer the input to the
buffer registers at any time without affecting the computation. Otherwise
input periods are interspersed through the computation. 
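
The buffering requirement can be sketched with two copies of the input values: one that is overwritten whenever input arrives, and one that is held fixed during the computation for a given time value. The class and method names are illustrative, and the sketch ignores the synchronization that a true interrupt-driven or multithreaded implementation would need.

```python
class InputBuffer:
    """Holds arriving control readings until the computation is ready for them."""

    def __init__(self, initial):
        self._pending = dict(initial)   # written whenever new input arrives
        self._latched = dict(initial)   # read by the computation for one time value

    def receive(self, name, value):
        """Record an input at any time (e.g., from an interrupt handler)."""
        self._pending[name] = value

    def latch(self):
        """Call once at the beginning of the computation for a time value."""
        self._latched = dict(self._pending)

    def read(self, name):
        """Value seen by the computation; constant between calls to latch()."""
        return self._latched[name]

buffer = InputBuffer({"elevator": 0.0})
buffer.latch()
buffer.receive("elevator", 0.2)          # arrives in the middle of a time step
seen_during_step = buffer.read("elevator")
buffer.latch()                           # beginning of the next time step
seen_next_step = buffer.read("elevator")
```

The value read during a step does not change when new input arrives, which is the consistency property the paragraph above asks for.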

It is a characteristic of output that data is required on a fixed schedule. 
Consequently, the computation for each fixed time increment may be
segmented so that between segments a certain amount of output is processed
and communicated. 

The requirements for advancing the variables and input and output 
indicate the time pattern of the computation for a fixed time interval
simulation. For a critical event simulation, the possibilities are more complex.
Input may only be required at each critical event, and if the simulation 
corresponds to real time, then the procedure at each critical event begins 
with an input sampling. It is also possible that certain inputs trigger critical 
events in addition to the precomputed events on the list. Certain outputs 
may be associated with critical events and occur with the corresponding 
computations. Scheduled output may be handled by an auxiliary program 
independent of the main simulation program. This may correspond to a 
special capacity of the computer to handle two independent programs, or it 
may be a specialized programming arrangement based on a computer clock. 

Thus, the time pattern of the data processing is determined by the time 
pattern of the original situation and the input and output requirements. This
time pattern may be realized by a sequence of instructions in the computer,
i.e., by using the normal logical capability of the computer with perhaps a
clock feature. However, it may also utilize special interrupt or multiple-
programming features. The kinetic or dynamic character of the math model 
determines the major possibilities for the programming, but this must be 
supplemented in order to provide input and output. In the computation a 
continuous regime consists of a sequence of central processing instructions 
dealing with a change of the continuous variables. The end of such a
computation regime corresponds either to a critical event in the previous sense or to
input or output. Clearly we can represent this computation pattern by a 
flow chart. The special interrupt or multiple-programming features relieve 
the need to make interruptions in the sequence of computer instructions and 
tend to make the program flow chart similar to the situation flow diagram. 

The output devices require not only data rates but also accuracy and 
resolution. The latter in turn are a requirement on the computation
procedures that realize the math model. Like the computation flow chart these also
must be documented. 

The program flow chart and the possible time patterns in the
computation are usually so complex that it is not practical to test all the logical
possibilities. Consequently a certain number of patterns must be chosen to 
yield overall systems tests that will be reasonably conclusive. 


2.10. Programming 

It is of great practical importance to have the computer program in 
“modular form.” A module may correspond to a continuous program regime
or to subroutines, which may appear in a number of such regimes. Other 
modules may correspond to input or output processing or the computations 
associated with critical events. The program flow chart can be realized by 
appending to each module a decision block that will determine the next 
module. But greater flexibility is obtained by the use of an “executive
program” to which the computation control returns after each module has been
executed and where the next module to be executed is determined. The
“flow chart” in the previous sense is now represented in a compact form in the
executive. In this form, it is practical to use much more sophisticated flow 
charts and adjust them on the basis of experience. Each module must be 
computationally complete in itself and one cannot have numerical subresults 
dangling from one module to the next to avoid duplication. This juggling of 
numerical subresults was a darling of flow chart programming and made 
readjusting the program a hazardous process for anyone including the 
original programmer.

To illustrate these ideas let us consider in particular the programming of
the OFT discussed in Section 2.5. In the usual flight phases the time
development of the simulation is given by a system of differential equations. A
numerical procedure is used to obtain the solution at times $t_0$, $t_0 + h$, $t_0 + 2h$, ...,
and the “step size” $h$ is also the time increment by which the simulation
advances. (A multiple of $h$ can be used as time increment to advance the
simulation.) In order to obtain the required accuracy, this time increment
must be chosen small, for example, 0.05 seconds.

There is also considerable input and output that must be processed in 
each time increment. The input and output apparatus external to the
computer is slower than the internal procedures of the computer, so that one
cannot process input and output in one batch at each time increment. Thus, 
while the simulation is being advanced for a specified time increment, the 
input for the next time increment is being placed in an appropriate buffer in 
batches. Similarly the output from the previous time increment is being 
processed and sent out of the computer in batches. Because of the brief 
increment time, this phasing of input and output is not noticeable to the 
trainee. 

The total program appears as a complex of modules and the executive 
program that controls the sequence of execution. One can readily classify the 
types of modules needed, and this classification essentially determines the 
structure of the executive program. One has: 

(a) Input modules that operate on the information in the input buffers 
and introduce it into the program in an appropriate format and scale. 

(b) Output modules that express numerical information in the required 
output format and buffer it for the output process. 

(c) Testing modules for each flight phase that determine whether a 
change is to occur and the new flight phase. 

(d) Initiating modules that set up the computing process for a newly 
initiated flight phase. A new flight phase may use quantities not used in the 
previous phase, for example, and initial values for these may have to be 
computed. 

(e) Computing modules. 

The modules usually have to be executed in a definite sequence. Thus, 
the air density, Mach number, velocity, $\alpha$, $\beta$, etc. must be computed before the
aerodynamic forces and torques in the wind axis system. The latter in turn 
are used to compute the corresponding quantities in the body system. When 
a numerical integration procedure is applied to a system of differential 
equations, all the derivatives must be computed at the same value of the 
independent variable, i.e., the time, in order to validate the customary error 
analysis and the choice of certain constants in the integration formulas.

Thus, in a given time increment, the first choice of the executive program
will be a type (a) input program. This will be followed by a type (c) testing 
program to see if a change of phase is required. If a change occurs a type (d) 
initiating program will follow. After that one will have a sequence of type (e) 
computing modules interspersed with input and output modules. The 
sequence is terminated by an output processing, type (b), module. 

The resolution of the program into modules and the formulation of the 
executive program constitutes a structure for the computation that must be 
precisely documented. The logical requirements of the executive program 
must be specified as must the accuracy, resolution characteristics, and data 
rates for the calculations in each module. The overall test exercises prev¬ 
iously determined imply specific test exercises for the executive program, and 
it is usually desirable to expand these so that one obtains a sequence of tests 
that can be applied in a progressive manner, starting with the most straight¬ 
forward substructures. Similarly a system of tests for each module should be 
set up, structured on the tests implied by the overall system testing. These 
aspects should be incorporated into an overall structure document and 
specification documents for the executive program and the modules. 

Usually it is not possible to solve the mathematical relations in the math 
model by a finite arithmetic algorithm. Thus, one must use approximating 
numerical procedures in the computation. These procedures must have the 
required accuracy, must be stable, and must be applicable over the range 
needed in the simulation. The accuracy is specified by the data requirements 
and this also applies to the range. Stability refers to the reliability of the 
numerical procedure. Many of these procedures have a range of starting 
situations, but only a portion of this range will produce a reasonably accurate 
answer. The reader is familiar with the Newton-Raphson method for solving 
equations. Some methods for solving differential equations are always 
unstable, but others are stable if one limits the amount by which the independ¬ 
ent variable is extrapolated at each step. The last limit depends on the differ¬ 
ential equations to be solved. 
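
Both remarks can be tried numerically. The sketch below uses two standard textbook illustrations that are not taken from this text: the Newton-Raphson method applied to arctan(x) = 0, which converges only from starting values close enough to the root, and the explicit Euler method applied to y' = -λy, which is stable only when the step h does not exceed 2/λ. The function names and tolerances are assumptions for the example.

```python
import math


def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration; returns the root, or None if it fails."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if not math.isfinite(x) or abs(x) > 1e12:
            return None          # iterates are running away
        if abs(step) < tol:
            return x
    return None


def atan_root(x0):
    """Solve arctan(x) = 0 starting from x0 (the only root is x = 0)."""
    return newton(math.atan, lambda x: 1.0 / (1.0 + x * x), x0)


def euler_decay(lam, h, steps, y0=1.0):
    """Explicit Euler for y' = -lam * y; each step multiplies y by (1 - h*lam)."""
    y = y0
    for _ in range(steps):
        y += h * (-lam * y)
    return y


if __name__ == "__main__":
    print("Newton from x0 = 1.0:", atan_root(1.0))   # converges
    print("Newton from x0 = 2.0:", atan_root(2.0))   # fails (None)
    print("Euler, lam=10, h=0.05:", euler_decay(10.0, 0.05, 40))  # decays
    print("Euler, lam=10, h=0.25:", euler_decay(10.0, 0.25, 40))  # grows
```

In the Euler case the permissible step depends on λ, that is, on the differential equation being solved, which is the dependence noted above.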

Numerical analysis provides many methods for obtaining approximate 
computational procedures. However, the associated discussion is usually not 
adequate to determine whether the requirements of the given situation are 
satisfied. Instead one must make tentative choices and then test the result for 
such properties as accuracy, stability, and range. These tests can be run on a 
general-purpose computer, not necessarily the computer to be used in the 
simulation. This experimental programming will also yield the amount of 
computation needed to attain the desired properties. In general, the more 
accuracy that is needed, the more computation is required. 
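
A small experiment of this kind might look as follows. It is a sketch only: the test problem y' = -y on the interval 0 to 1, the two candidate methods (explicit Euler and the classical fourth-order Runge-Kutta formula), and the function names are assumptions chosen for illustration. Each run reports the number of derivative evaluations, as a rough measure of computation, together with the error against the known solution.

```python
import math


def integrate(method, f, y0, t_end, n_steps):
    """Integrate y' = f(t, y) from t = 0; return (y(t_end), derivative evaluations)."""
    h = t_end / n_steps
    t, y, evals = 0.0, y0, 0
    for _ in range(n_steps):
        if method == "euler":
            y += h * f(t, y)
            evals += 1
        elif method == "rk4":
            k1 = f(t, y)
            k2 = f(t + h / 2, y + h * k1 / 2)
            k3 = f(t + h / 2, y + h * k2 / 2)
            k4 = f(t + h, y + h * k3)
            y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
            evals += 4
        else:
            raise ValueError("unknown method: " + method)
        t += h
    return y, evals


def experiment(step_counts=(10, 20, 40, 80)):
    """Tabulate (method, steps, evaluations, error) for the test problem y' = -y."""
    exact = math.exp(-1.0)
    rows = []
    for method in ("euler", "rk4"):
        for n in step_counts:
            y, evals = integrate(method, lambda t, y: -y, 1.0, 1.0, n)
            rows.append((method, n, evals, abs(y - exact)))
    return rows


if __name__ == "__main__":
    for method, n, evals, err in experiment():
        print(f"{method:5s} steps={n:3d} evals={evals:4d} error={err:.3e}")
```

Doubling the number of steps roughly halves the Euler error and reduces the Runge-Kutta error by a factor of about sixteen, so the table gives a direct estimate of the computation needed for a required accuracy on this particular problem.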

Numerical analysis is an essential guide for these experimental computations, 
and experimentation without this guidance is wasteful of time and 
yields nondependable results. On the other hand, a preliminary mathematical 
analysis can be objectionable, yielding far too conservative estimates for 
accuracy and stability, and is inserted in the development so as to increase 
project time. The term “programming” may include the documentation of 
the block diagram, math model, etc. of the original situation, and it usually 
includes the process of determining the mathematical procedures to be used 
in the program. The description of these mathematical procedures is referred 
to as the math model for the computer program. 

The experimentation on which the choice of the computer math model 
is based utilizes computer languages such as PL/I or Fortran. This 
experimentation may be continued into a first version of the major aspects 
of the simulation program. This first version can usually be debugged in a 
rather straightforward way using the general language facilities. The machine 
language program produced by a compiler is relatively inefficient, but tech¬ 
niques exist for improving the efficiency of such a program. Alternatively, 
after the precise procedures are determined, the program can be written 
directly in an assembler language. The final efficient machine language pro¬ 
gram can be debugged by means of numerical values obtained from the 
compiler program. It is desirable to obtain a precisely documented program 
math model and to base the actual coding on it. 

The equipment is usually obtained in the form of a number of com¬ 
ponents that are assembled in a progressive manner into a complete device. 
Programming is used to test this integration process as it proceeds step by 
step. 


2.11. Management Considerations 

Figure 2.5 is a block diagram that indicates the logical sequence in the 
development of a simulation. The logical sequence is also the planned 
sequence of management on the assumption that the amount of experience 
available will be adequate to limit readjustment and recycling. The little 
circles are critical events and are usually associated with reporting require¬ 
ments, although the reporting requirements are more extensive than those 
indicated. 

Planning requires a report schedule and estimates of the various 
resources needed. Figure 2.6 is a report schedule. Connections could also be 
drawn to indicate the dependence of later reports on a given report so that a 
separate table could be set up. The purpose of these report connections is to 
indicate the effect of a failure to maintain schedule. They indicate that the 
work on different aspects is subdivided and tightly interconnected to mini¬ 
mize the total project time. 

Figure 2.5. Block diagram for a simulation development. 

Figure 2.6. Report schedule. 

A block-by-block scan of Figure 2.5 and the timing indications of Figure 
2.6 can be used to estimate the resources required. 

Senior personnel are usually assigned individually on a month-by-month 
basis, and the assignment of associates follows. This applies to the categories 
of supervisors, analysts, and engineers. In addition, month-by-month 
estimates are made of the number of various types of support personnel, such 
as coding assistants, key punch operators, technical personnel, and secre¬ 
taries. One must also schedule computer time, technical facilities, office 
space, and material. This planning permits cost estimates on a scheduled 
basis. 

Some understanding of management procedure is desirable for the 
professional employee, but our immediate purpose is to indicate the critical 
requirements for scheduling and documentation. Failure to maintain schedule 
leads not only to cost overruns but also to manpower availability binds and 
similar binds for facilities and equipment. Diagrams similar to Figure 2.6 
with the appropriate dependence relations are used to monitor the time 
development of the project. 
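
The effect of a schedule slip on dependent reports can be sketched with a small dependency calculation. The report names, durations in months, and dependence relations below are hypothetical and are not read from Figure 2.6; the sketch assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def earliest_finish(durations, depends_on):
    """Earliest completion month of each report, given its duration and
    the reports it must wait for."""
    graph = {task: depends_on.get(task, ()) for task in durations}
    finish = {}
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[d] for d in depends_on.get(task, ())), default=0)
        finish[task] = start + durations[task]
    return finish


# Hypothetical report schedule (months) and dependence relations.
DURATIONS = {
    "requirements": 2,
    "math_model": 3,
    "equipment_spec": 4,
    "program_spec": 2,
    "integration": 3,
}
DEPENDS_ON = {
    "math_model": ["requirements"],
    "equipment_spec": ["requirements"],
    "program_spec": ["math_model"],
    "integration": ["equipment_spec", "program_spec"],
}


def with_slip(task, months):
    """Completion months when one report takes `months` longer than planned."""
    slipped = dict(DURATIONS)
    slipped[task] += months
    return earliest_finish(slipped, DEPENDS_ON)


if __name__ == "__main__":
    print("planned:            ", earliest_finish(DURATIONS, DEPENDS_ON))
    print("math model +2:      ", with_slip("math_model", 2))
    print("equipment spec +1:  ", with_slip("equipment_spec", 1))
```

In this made-up schedule a two-month slip in the math model report delays the final integration report by two months, while a one-month slip in the equipment specification is absorbed, which is the kind of information the dependence connections are meant to provide.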

Notice that Figure 2.5 is a block diagram in the influence diagram sense 
and also in the sense of logical dependence. With appropriate connections, 
Figure 2.6 is a temporal flow diagram. The overlap in the time diagram, which 
seems inconsistent with logical dependence, is permissible because a pre¬ 
liminary report on one block may permit work to be initiated on a logically 
consequent block. This overlap may be highly desirable both for total time 
and the availability of certain resources. Overlap may also be helpful in both 
anticipated and unanticipated feedbacks in the total development. 


2.12. Validity 

A simulation is a contrived experience reflecting some original situation. 
We have seen the procedures used to set up the simulated situation, and it is 
clear that the faithfulness of the result depends on the scientific and technical 
understanding of the original situation. 

In general, the scientific principles that govern the situations of interest 
are well known. But the complexity is in most cases so great that a complete 
scientific understanding in the sense of a mathematical description is impos¬ 
sible. The more limited mathematical formulations that are available usually 
cannot be solved. However, the development of large-scale computing 
facilities made it possible to deal with a considerable range of technical 
understanding in the form of a combination of scientific theory and empirical 
information. The empirical information contains direct quantitative data, 
but it also contains mathematical relations and behavior patterns, for 
example, stress and strain relations in materials. It is this combination of 
scientific theory, empirical experience patterns, and empirical quantitative 
information that is integrated into the math model. 

Thus, over a considerable practical range technical understanding cuts 
through the Gordian knot represented by our lack of complete scientific 
analysis and mathematical solutions. But one can also consider the situation 
from another point of view. The term “technical understanding” can also be 
considered to include experience patterns that do not have a mathematical 
formulation in the usual sense. Many human activities were based on this 
technical understanding, including agriculture, the production of metals, 
and the construction of carriages and sailing vessels. One can consider the 
introduction of scientific theories as an extension and structuring of previous 
technical experience, permitting a considerable increase in accomplishment. 
Metallurgy was tremendously improved by the knowledge of the micro¬ 
scopic structure of metals and the phase characteristics of alloys. Modern 
theories of heredity greatly assisted agriculture and husbandry. The Wright 
brothers invented the airplane using Newton’s law and empirical information 
about the action of airfoils and the Otto engine. Thus, one can consider the 
scientific element in technical understanding as a growth factor that enhances 
the available procedures or permits a significant breakthrough, not a weak¬ 
ened version of a perfected understanding. The focus of consideration should 
be the understanding of the complete activity, not a theoretical scientifically 
complete background. Actually this is a justification for increasing the 
scientific understanding. This increase takes on many forms, from the purely 
academic to what may be termed “technical research.” Since the objective 
is the math model for prediction and control, technical research deals with 
an understanding of experience patterns combining scientific theory and 
empirical behavior. For example, technical research deals with fluid flows 
and the behavior of substances under heavy stress empirically, but in a 
framework of dimension theory and thermodynamics. 

This nature of the technical understanding is reflected in many aspects 
of these efforts. For example, organized team experience becomes extremely 
valuable. It permits one to begin with a general structure of understanding 
based on past experience. Of course, the other side of the coin is the capa¬ 
bility of adding new aspects to this understanding. One very obvious element 
in modern technology is the intrusion of additional scientific theories. The 
team understanding must be responsive in this direction also. But technical 
understanding also contains an ad hoc element in its choice of experience 
patterns to determine the math model. This implies that there is an essential 
requirement to test the understanding against experience. One can begin, 
we hope, with a good approximation to the desired math model, but experi¬ 
ence will practically always show a need for adjustment. Experience with a 
prototype invariably results in design changes and changes in the math 
model. A highly significant aspect of the use of digital computers in various 
devices is that experience patterns with these computers have become 
stabilized and that they do not require “engineering.” 

This need for readjustment and the ever present possibility that a 
project may be shown to be not feasible and must be terminated has serious 
practical consequences. People with poor technical understanding invariably 
consider a venture that has to be terminated as a “waste.” This type of 
judgment is analogous to overconservative bridge-game bidding. It permits 
no innovative explorations to take advantage of new developments. No bet 
is ever made because the stake may be lost. 

Technical understanding is the basis for the math model and the faith¬ 
fulness of the simulation. However, one must also take into account the fact 
that faithfulness of the simulation is not the primary objective. These simula¬ 
tions are part of a larger project, and the main purpose is to permit the effort 
to proceed. This continuance requires that certain decisions be made and 
justified. However, this justification may not logically require that a faithful 
representation be obtained, or one may be willing to proceed on a basis that 
is not logically complete. The simulation cost must be weighed against the 
risk of a limited answer. 

We have noted at least three types of understanding that will permit such 
an effort to continue. But even the most cursory technical decision will be 
based on some quantitative estimates, and thus these understandings are 
mathematically formulated. Quantitative understanding yields prediction 
and control and is an essential part of large efforts. It is the most characteristic 
aspect of our culture, and its complexity reflects forty centuries of develop¬ 
ment. The applied mathematician will find that there is no simplistic philo¬ 
sophical approach that is a satisfactory substitute for an appreciation of the 
historical development. 


Exercises 

2.1. Consider the following devices from the point of view of simulation in terms 
of (i) different regimes of activity, (ii) snapshot block diagrams for an instant in time in a 
regime, (iii) critical events, (iv) overall block diagram, (v) flow chart, (vi) math model: 

(a) The mechanism for controlling the water action in a water closet 

(b) A spring-driven or weight-driven clock or watch 

(c) The mechanism of an automatic-fire weapon such as a repeating rifle or 
machine gun 

(d) The internal ballistics of discharging a gun 

(e) A one-cylinder, two-sided piston steam engine 

(f) An ac-dc electric motor 

(g) Ac only electric motors 

(h) A two-cycle one-cylinder gasoline engine 

(i) A four-cycle one-cylinder gasoline engine 

(j) A rotatory internal combustion engine 

(k) A rotatory printing calculator that adds, subtracts, totals, subtotals, and 
clears 

(l) An electronic flip-flop circuit in multivibrator mode; also in one-shot 
mode 

2.2. Analyze the procedural steps required in planning the purchase of a new home, 
a new car, or a major appliance. What is the relation of advertising to an analytic 
approach to purchasing? 

2.3. The research, development, and design phases of a project produce nothing 
tangible, so one can cut costs by minimizing these. Discuss the proposition that the least 
expensive project will be the one with the cheapest research and development stages, and 
support with reference to actual cases. 

2.4. Cost estimates for the total project are given in the requirements stage, research 
stage, prototype testing, and all through the production phase. What happens to these 
estimates and why? 

2.5. Compare theoretical and empirical methods for investigating steady fluid 
flows and shock waves. 

2.6. For debugging purposes it is desirable to specify the scenarios that cannot 
occur or are impermissible. One can assume “start” and “end” as critical events, 
connected to certain other critical events by regimes called “initiation” and “termination,” 
respectively. Permissible scenarios must connect “start” to “end” and can be subject 
to other restrictions. One question of interest is to replace a given flow chart and permis¬ 
sibility limitations by a flow chart with the same labels for critical events and regimes 
in which all scenarios correspond to permissible scenarios of the original problem. What 
axioms for defining a concept of the set of permissible scenarios will yield this result? 

2.7. Discuss the scenarios for the devices in Exercise 1. Express the desired actions 
in terms of scenarios and also the unsatisfactory actions. What options does one have to 
insure that only the desired action will occur? 

2.8. Consider a business situation in which four types of events occur essentially by 
chance and each type produces a characteristic piece of information, i.e., a data record, 
that must be processed. Let $p_i$ be the probability that a given event be of type $i$ and $a_i$ the 
corresponding processing time in milliminutes. The number of events per minute $N(t)$ 
can be considered as determinate as a function of time of day and season. The data 
processing system contains an interrupt feature that, independently of the normal 
processing, places each data record in an appropriate file and maintains files of arrival 
times and number of records to be processed. The processing may be subject to various 
requirements, such as a priority system or that of minimizing the maximum waiting 
time. Discuss an executive-type program for handling this situation and modifications 
of the notion of scenario to handle situations involving chance. 

2.9. In a “map war game,” a conflict between two opposing forces (traditionally, 
blue and red) is represented as follows. Each side has a command team consisting of the 
commanding officer and his staff. Military units are represented on a map. The effective 
map, however, is viewed only by a third group, “the umpires,” who also have manpower 
and logistic records for each side. Each side communicates with the umpires in a form 
corresponding to the issuance of orders to its units and the return of intelligence informa¬ 
tion from its units. The umpire group continuously decides what happens in regard to 
motion, firepower, casualties, and logistics as a result of the orders and the map situation. 
Each side maintains its own version of the map situation and records. Such games are 
used to test plans and for training. (There is a three-board form of chess on the same 
principle, called Kriegspiel.) 

Which aspects of simulations discussed in the text are present in this game and which 
are missing? What are the advantages and disadvantages of this procedure? Compare 
this with a “computerized war game” in which the umpire group and the effective map 
are replaced by data processing equipment. 

2.10. Describe an executive program suitable for (a) a critical event simulation, 
(b) a fixed time increment simulation, (c) a “track while scan” radar computer program, 
(d) a business record processor as in Exercise 2.8. 

2.11. The executive program must choose the next module on the basis of informa¬ 
tion available in the computer and choice criteria. Analyze the forms this information can 
take and the possible executive programs. 

2.12. A cathode-ray-tube display is either static or gives the impression of motion 
by a succession of frames, each of which is a complete static picture. But such a static 
picture is actually generated by the trace of a fast-moving spot whose x and y coordinates 
on the face of the tube and intensity are functions of the time. Describe the various ways 
in which a static picture can be analyzed into elements that have a precise numerical 
description and can be represented on a cathode-ray tube. A complex of such elements 
can then be represented in the computer as a “stored image” corresponding to a complete 
cathode-ray-tube display. 

Many procedures have been developed to represent situations of various kinds by 
a two-dimensional display or by some combination of such displays. The corresponding 
stored images can be developed and manipulated in a consistent manner by the data 
processing system. Communication with people professionally involved has been well 
developed for such computer systems, and many operations that previously had required 
a great deal of human effort and skill are readily available as computer output. 

Thus, one has a concept, analogous to language, with, however, a greater potential 
for simulation of both spatial relations and temporal developments. Discuss this concept 
relative to the design of machinery, architecture, and surgery. Include the possibility of 
automatic realization as represented, for example, by computer-controlled machine 
tools. 

2.13. Discuss procedures for obtaining the characteristic roots of a symmetric 
matrix; also, methods for obtaining the largest root or the least in absolute value. 

2.14. In view of roundoff, what does it mean to say that a set of variables satisfies 
an algebraic equation in the computer? Consider the three possibilities, single precision, 
double precision, and floating point. Given a nonsingular system of simultaneous linear 
equations and a set of values for an unknown that satisfy the equations in the computer, 
what can be said about the accuracy of the solution in each of the above possibilities? 
Discuss iterative procedures for solving simultaneous linear equations. 

2.15. Three equations $F_i(x_1, x_2, x_3, t) = 0$, $i = 1, 2, 3$, determine $x_1, x_2, x_3$ as functions 
of $t$. Describe a process for obtaining $x_1, x_2, x_3$ corresponding to a stepped sequence 
of values of $t$. How can one establish the stability and accuracy of such a process? 

2.16. Consider a system of $n$ ordinary differential equations on $n$ unknowns. How 
does a solution, i.e., a set of $n$ functions, depend on an error in the values of the initial 
conditions, assuming the error is quite small? Show how the coefficients of the errors 
can be obtained as functions of the independent variable by solving systems of $2n$ 
ordinary differential equations on $2n$ unknowns. 

2.17. In the step-by-step integration of a system of ordinary differential equations, 
one can regard each step as yielding the correct advance values from the previous values 
plus an error that can be regarded as an error in the initial conditions for the rest of the 
computation. This can be utilized to express the total error after a number of steps in 
terms of the error at each step. Various integration procedures yield step errors pro¬ 
portional to a power of the step size greater than one. Discuss the convergence of such 
approximations to the solution as the step size diminishes, ignoring the effect of roundoff. 
How can this be used to estimate the error in integration procedures as a function of 
step size? What is the effect of changing the scale of the dependent variable? What is the 
effect of roundoff? 

2.18. A step-by-step integration process replaces a system of n differential equations 
by a system of difference equations. Discuss the stability of this process. 

2.19. There are a number of procedures for obtaining solutions of linear partial 
differential equations with appropriate boundary conditions. Survey the applications of 
(a) separation of variables, (b) Green's function, (c) methods of characteristics, (d) relaxa¬ 
tion methods, (e) finite difference procedures for parabolic and hyperbolic equations. 

2.20. Discuss logically structured procedures for debugging a simulation program. 

2.21. Training equipment based on simulation is used for many purposes, such as 
submarine crew training, carrier aircraft landing, astronaut docking maneuvers, and 
command control of supertankers. What would be appropriate equipment block 
diagrams, information flow charts, and integration plans in each case? 

2.22. What professional training would be appropriate for simulation analysts? 

2.23. The thermodynamic state of substances is important for the simulations of 
steam engines, internal combustion engines, internal ballistics of guns, explosions, and 
shock waves. An associated concept is that of ignition or the propagation of flames. What 
are the names and definitions of the thermodynamic and ignition variables that appear 
in these simulations? 

2.24. Obtain the block diagram, math model, and flow chart for any system in the 
following exercises from Chapter 1 that interests you: Exercises 4, 10, 11, 16, 20, 21, 22, 
23, 24, 26. 

2.25. Investigate the procedures used to simulate optical devices, in particular, 
telescopes, microscopes, binoculars, range finders, cameras for photography, television 
cameras, sniperscopes, and the eye. How is ray tracing related to the lens formula? 

2.26. A recent development has been the formation of three-dimensional images 
by holography. Investigate the mathematical basis of this concept. 

2.27. Obtain block diagrams, math models, and flow charts for a pocket-size 
electronic calculator. (See McWhorter.(4)) 


Understanding and 
Mathematics 


3.1. Experience and Understanding 

We will now consider a general framework for “understanding.” As 
individuals we deal with an environment that affects us either favorably or 
unfavorably, and our continued existence depends on our interaction with it. 
Experience is this process of dealing with the environment. Experience is 
certainly continuous, but we consider that we can resolve it into unit sub¬ 
procedures in which we deal with a “situation” and use our understanding to 
guide our actions to produce a favorable result. Clearly this understanding 
is based on past experience or on learning, which corresponds to the experi¬ 
ence of others. Preceding experience must, therefore, be structured into 
patterns that we can recognize in the situation we are dealing with. 

Dealing with such “units” of experience occurs on many levels of com¬ 
plexity, from our reactions to situations of immediate danger through many 
types of problems of everyday living, to the simulations we discussed in the 
previous chapter. These levels certainly differ in the extent to which we are 
aware of the structure of the process, and our interest is with those levels in 
which the relation of understanding and experience is explicit to the extent 
that communication between individuals and mutual action is possible. 
Communication and joint action also imply mutual understanding, which 
must include a common structuring of past experience. Part of this common 
structuring is educational in origin, part of it is the result of communicated 
individual experience, part of it is due to mutual experience, and there may 
be important elements that are innate or hereditary. 

In order to associate patterns of past experience with the current situation, 
understanding must function in two complementary ways. As an integral 
part of the awareness process it produces in our consciousness a conceptual 
analog of the situation and it also produces experience patterns associated 
with the concepts involved, thus permitting us to explore, in our imagination, 
possible experiences corresponding to optional actions on our part. 

The conceptual analog of the situation appears in terms of concepts that 
permit the formulation of experience patterns. For communication purposes, 
these concepts must be associated with symbols, for example, names, and 
mutual understanding must permit us to associate the symbols and concepts. 
For each individual to have a correct version of these concepts, there must be 
a community sharing of experience and association of the symbols with the 
appropriate experience, and this is the function of education. The formulation 
of concepts may represent an innate tendency of our minds, but even if this 
is so, communication and mutual interaction of individuals certainly seem 
necessary in order to insure the similar nature of a concept in different 
individuals. 

Examples of such concepts are represented by the forms of classical 
logic; for instance, the terms “man,” “dog,” and “triangle” represent a 
division of individual objects into classes we can readily recognize. Each of 
these classes is associated with a specific and complex combination of experi¬ 
ence patterns. Reasoning associates this combination with the objects 
involved. We recognize that Socrates is a man and we know that men are 
mortal and we infer that Socrates will die. We know that dogs bite, meter 
maids give parking tickets, and 13 is an unlucky number. 

This conceptual division into classes has certain general properties 
associated with the idea of class or set itself and with the identity of objects. 
The inclusion relation has certain properties, as does the notion of being a 
member of a class. These constitute an experience pattern called “logic,” 
and this pattern is widely applicable. The terms “and,” “or,” “implies,” 
“all,” and “there is one such that” refer to logic. Language permits this 
conceptual experience to be shared and common agreement to be reached. It also permits 
one to check the conceptual development against the generally accepted 
characteristics of the experience patterns. 

Thus, we have a formulation of experience, or at least part of it, in terms 
of objects and classes. Since this experience pattern is expressed in terms of 
symbols and language, it can be considered to be abstracted from the specific 
realization and becomes an intellectual experience itself. When the basis of 
such an intellectual experience is precisely described relative to the symbolic 
representation, the intellectual experience can be shared by many individuals 
and effectively utilized to structure experience. 

“Logic” in the above sense is often considered part of an extended intellectual 
system that claims to represent all experience patterns. Such an extension 
is called “philosophy,” but there is no universal agreement on what a 
correct philosophy is. “Logic” itself can be considered simply as a format for 
experience. This format is particularly valuable when applicable. 

Mathematics consists of more general abstracted experience patterns, 
which include the logical pattern. The simplest of these and the one that has 
the widest pattern is that which includes counting, the natural numbers, and 
the notions of objects and sets and is usually referred to as “elementary 
arithmetic.” As in the simpler case of logic, an area of mathematics is an 
intellectual experience associated with symbols and corresponding to an 
abstracted complex of experience patterns. The symbolic representation 
permits it to be shared and verified. To be useful as an experience format, it 
must be free from contradiction. Whether the pattern provides an effective 
conceptual structure for any specific situation must be verified by experience. 


3.2. Unit Experience 

Let us now describe by means of a flow diagram (Figure 3.1) a conceiv¬ 
able experience unit of a relatively simple nature. The experience units of 
interest to us are on a more sophisticated level, but we will show how this 
level can be obtained by a sequence of refinements. The nature of these 
refinements will be highly significant for us. 

The flow diagram represents the sequence of actions of an individual. 
The individual directs his senses to observe a situation. This process yields 
information that is associated with available concepts to produce a conceptual
analog of the situation. However, this procedure is not a unidirectional
flow, but is interactive. The information suggests concepts that yield directions
for additional observation. The diagram does not try to represent the
various readjustments that are part of this process. Direction is indicated 
by an encircled D. 

Available experience patterns are applied to the conceptual analog in 
order to consider possible actions. The logic patterns involving objects and 
classes are frequently useful to structure the intellectual development, but 
this is also true of the more general patterns of mathematics, especially 
arithmetic. In this imaginative development one also assumes various 
possible actions—one’s options. It may also be true that the experience 
patterns available may not permit one to decide on a course of action, but the 
major objective is precisely such a choice. 

If a decision is made and action is taken relative to the situation, this
action results in certain effects on the individual. These effects are compared
with those anticipated from the reasoning. If these are in agreement, we consider
the understanding involved in our conceptual analog, that is, the reasoning,
concepts, and experience patterns, as satisfactory. Normally this is a
very important element in our sense of security. If there is a disagreement, we
feel an urgent need to correct our conceptual analog of the system by further
observation, and if our awareness seems satisfactory we must check our
reasoning and if necessary our concepts and experience patterns. If it appears
that these must be adjusted, then additional experience usually is required,
corresponding to the logical exploration needed to establish the new experience
patterns.

Notice that there is the possibility that the concepts and experience 
patterns may be modified or adjusted. If we consider the combination of 
concepts and experience patterns as a body of knowledge, then there are 
various criteria for the validity of such knowledge that can be applied: 

(a) Correct guidance is always obtained. 

(b) Correct guidance is obtained when explicit tests are made. 

(c) There is a sequence of modifications each of which yields correct 
guidance for a more inclusive set of circumstances. 

(d) Satisfactory guidance is obtained in the sense that the cases of 
incorrect guidance are considered unimportant. 

(e) One can introduce modifications on a given set so that correct 
guidance can be obtained in any case of interest. 

Clearly this list can be extended, but it does illustrate how much latitude 
exists in the concept of a satisfactory body of knowledge. 

There are various philosophical points of view that assume that certain 
aspects or blocks in this diagram generate all the others. For example, there 
is the point of view that the line joining “situation” and “observation” is 
the critical origin of the diagram that starts all else. Other points of view will 
begin with either the “concepts” box or the “experience pattern” box and 
assume that these are generative. We will simply assume that at any time, 
experience will at least contain the indicated complications, that all the 
elements will evolve in time, and if someone wishes to understand the overall
development or even its present nature he must look for as much of the past 
history as is available. 

For an introduction to various philosophical approaches, one can 
consult Joad. (4) The Aristotelian “universe of individuals” is described by 
de Wulf. (2) 


3.3. The Exact Sciences 

We now consider various refinements in the experience scenarios given 
in the preceding section that correspond to the cases of primary interest to us. 
It is convenient to discuss these refinements in terms of a block diagram in 
which certain procedures are lumped together under rather obvious headings 
(Figure 3.2). Awareness yields a conceptual analog in our mind that must be 
matched with structured experience to permit reasoning concerning outcomes.
Understanding corresponds to the combination of concepts and
experience patterns that is the basis of reasoning. We are particularly
interested in the case where the conceptual experience patterns include those that
are “mathematical.”

Figure 3.2. Simplified version of interaction.

A mathematical intellectual experience can be considered constructive 
inasmuch as the various patterns of experience can be combined into larger
complexes, and such a construction has a symbolic representation. The usual
mathematical operations are steps or unit elements in these constructions.
Thus, the imaginative development of one person can be communicated and 
also symbolically checked by himself and others. Hence, mathematics is 
particularly suitable for planning and cooperative activity. 

The constructive nature of the operational procedures permits considerable
imaginative exploration in any situation that can be linked to mathematical
concepts. The extensive development of the mathematical operational
procedures themselves is available and is highly useful. We have seen in
simulations the production of mathematically based time histories to establish
possibilities. Thus, when mathematics can be used it greatly extends
the experience patterns that are available from logic.

Many sciences have such a mathematical aspect. A science is an overall
complex of methods for dealing with a milieu of experience. In the “exact”
sciences, there is a central mathematical formulation that permits a purely
mathematical investigation of the situations of the experience milieus.
Thus, a given situation of this type can be considered to have a math model,
which is obtained by an analysis of the situation in terms of the conceptual
experience patterns of the science.

The analysis of the situation can be considered to be the equivalent of 
setting up an influence block diagram in which the blocks correspond to the 
concepts of the science. In the exact sciences the concepts are associated with 
precisely disciplined procedures, for example, measurements, whose outcome 
is mathematical. Typical examples are lengths, volumes, weights, angles, 
temperatures, and electric currents. Mathematical relations between these 
outcomes that yield satisfactory predictions for dealing with the milieus 
constitute the math model. One can have functional relationships or more 
subtle ones involving derivatives or probabilities. Mathematical procedures 
are available for reasoning based on the known mathematical relations. 

Sciences are usually described as “bodies of knowledge,” and there is an 
implicit separation of the “theory” or mathematical formulation of the 
structured experience from the rest of the understanding. Furthermore, this 
theory or mathematical formulation is considered to contain all the “knowledge”
of the science. But such a theory must be supplemented by a procedural
interpretation of the concepts. The applied mathematician must appreciate 
that “scientific understanding” refers to the complete procedure for dealing 
with a milieu of experience including the actual operational procedures. 
Changes and improvements in the latter are often the essential aspect of a 
scientific advance. 

This is important also for the following. By definition, the reasoning 
associated with an exact science can be expressed in symbolic mathematical 
form. But normally, the reasoning is not confined conceptually to purely 
mathematical procedures. A wider range of conceptual reasoning is used 
based on the additional concepts and experience relations of the science itself. 
The mathematician must learn that this is quite appropriate. The concept 
of a “particle” in physics has a set of mathematical variables that specify it 
precisely, and yet conceptually one prefers to reason with it as an idea in its 
own right. 

Our distinction between the mathematical model and the rest of the
understanding is relatively modern and corresponds to a belief in the conventional
and inventive characteristics of mathematics. The natural philosophers
of the eighteenth century believed that both mathematics and the
various scientific logical developments were part of an axiomatic structure
by which we comprehended nature.


3.4. Scientific Understanding

As “understanding,” the conceptual aspects and mathematical theory of
a science are subject to the requirement of being a continuing satisfactory
match to a milieu of experience. This match is usually thought of as involving
the maximum available exploratory testing, i.e., experimenting, but it must
also include the procedures of applied technology. The intellectual development
of a science is part of the broad experience by which a culture advances
and yields advanced technologies. On the other hand, technical improvements
expand the available areas to which the scientific understanding is
applicable and are incorporated into the operational aspects of scientific
concepts. The expansion of the milieu and the improvement in conceptual
procedures frequently require essential changes in the intellectual aspects of
science.

The mathematical formulations of technical procedures have contributed
very significantly to the spread of applications, as for example, in
circuit theory, fluid flow, and statistics. Usually the basis of a technological
advance is some phenomenon for which a scientific basis has been developed
in the form of quantitative mathematical relations. The latter is then adapted
to the needs of the technical situation and becomes a part of the engineering 
repertoire. This advance of technology due to scientific developments is 
probably very familiar to the reader. 

On the other hand, many fundamental improvements in scientific 
understanding have resulted from advances in instrumentation, and these 
advances were due to technological advances. Such developments are of 
course well known. The use of optical instruments was a critical advance in
many sciences: the microscope in biology, the telescope in astronomy, the
spectroscope in physics and chemistry. Before the telescope, ancient instruments
permitted the original Ptolemaic description of the solar system, but
improvements in these instruments were adequate to support Kepler's
laws and their Newtonian consequences.

An educated person should certainly be aware of this complex interaction
of science and culture. One aspect is rather important for the applied
mathematician. He must appreciate that scientific concepts are based on
disciplined technical procedures and that the consequences of a mathematical
theory must be interpreted in terms of these concepts. This applies in
particular to the “data” that must be used in technical applications. The
experimental context of such data must be carefully understood and may
correspond to essential limitations. The precise procedural equivalent of
various notions is usually very interesting, especially from the point of view of
the manner in which the basic procedure has to be supplemented. Useful
information about the length of a bar of metal may have to include the temperature
and tension. The measurement process becomes increasingly complex
as one deals with the distance concept in machine parts or variations on
the surface of a telescopic mirror. On the other hand, the distance to heavenly
bodies requires a pyramid of procedures. In this case it is clear that the
simplicity and computational aspects of the mathematical concept have been
retained at a considerable expense of complexity for the experience patterns.

This discussion indicates the problems associated with the awareness 
process when scientific analysis is required. The mathematics to be used may 
be available in standard form, but not the data that specify the situation. 
Experimental information must always be considered in the context of the 
actual procedures, and these procedures may be quite disparate, even when 
the same expression is used and has the same mathematical significance. 
For example, the expression “speed of sound in sea water” has to be interpreted
very carefully to insure that one knows what parameters (such as
temperature, density, depth, and salinity) have been taken into account and 
how they were considered. “Scientific facts” or “scientific data” are significant 
only in a framework of scientific understanding. Experience has shown that 
theoretical unifications based on purely conceptual or mathematical identity 
can lead to false results. 


3.5. Logic and Arithmetic 

We frequently use the experience pattern of objects, classes, and natural 
numbers to structure situations. We make up shopping lists, laundry lists, 
lists of our possessions, and lists of our investments, and we make numerical 
comparisons and evaluations. Thus, arithmetic is an extension of elementary 
logic. One uses set notions to divide objects into various categories and 
express the relation between categories. Many aspects of the situation are 
expressed in terms of counting and the arithmetical operations. For example, 
the total cost of a shopping list is evaluated by these means. Thus, elementary 
arithmetic partakes of the nature of logic and deals with the same general 
experience patterns. 

Agreement on these experience patterns relative to the counting process 
and its outcome, numbers, is of great practical importance. Thus, one must 
have agreement on the symbols for numbers and the effect of the arithmetic 
operations. The abstract conceptual situation is such that specific experience 
can be used to establish general relations, i.e., since two apples and two apples 
make four apples, we know that 2 + 2 = 4. But this also means that one can use
ad hoc simple examples such as sets of marks to establish arithmetic relations
by applying the appropriate set of operations; for example, “six times seven
is forty-two” can be shown by taking six sets of seven marks and counting
them.
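
The counting argument above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names `count_marks` and `times_by_marks` are hypothetical, and a mark is represented by the character "|".

```python
def count_marks(marks):
    """Count marks one at a time, as one would by pointing at each in turn."""
    total = 0
    for _ in marks:
        total += 1
    return total


def times_by_marks(groups, size):
    """Lay out `groups` sets of `size` marks and count the whole collection."""
    marks = []
    for _ in range(groups):
        marks.extend("|" * size)  # one set of `size` marks
    return count_marks(marks)


# "Six times seven is forty-two", established by counting alone.
print(times_by_marks(6, 7))
```

No multiplication of numbers is used to obtain the result; the multiplication of the character "|" merely draws the marks, and the answer comes from counting them.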

But in all cases of practical interest, this unsophisticated approach must 
be supplemented by using the notion of aggregation, or forming sets, for 
counting purposes. This is extremely familiar and results in our decimal 
notation and the abacus. Sets are represented on the abacus by a uniform 
system of aggregation, and the use of the abacus corresponds to the effect 
of the basic set operations on counting. The decimal notation corresponds 
to a somewhat greater use of symbolism but has essentially the same character. 
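
The uniform system of aggregation can be sketched as follows. This is an illustrative model only, not a description of any particular abacus: the names `aggregate` and `add_on_abacus` are hypothetical, and each rod is assumed to hold fewer than ten beads.

```python
def aggregate(count, base=10):
    """Group `count` marks into sets of `base`, then sets of sets, and so on.

    Returns the beads on each rod, lowest rod first (units, tens, ...).
    """
    rods = []
    while count > 0:
        count, leftover = divmod(count, base)
        rods.append(leftover)
    return rods or [0]


def add_on_abacus(a, b, base=10):
    """Combine two rod settings, carrying each full set to the next rod."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        beads = carry + (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
        carry, beads = divmod(beads, base)
        result.append(beads)
    if carry:
        result.append(carry)
    return result


print(aggregate(42))                               # units rod first
print(add_on_abacus(aggregate(42), aggregate(58)))
```

Read from the highest rod down, the rod settings are the decimal digits, which is the sense in which the decimal notation has the same character as the abacus.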

It is interesting to notice the analog character of the abacus. Thus, if a 
situation is reflected into prices, the value relations can be represented on the 
abacus. 


3.6. Algebra 

In planning, problems occur that are solved by a sequence of arithmetical
operations called an algorithm. Our earliest known historical examples
of arithmetic are in this form. A specific problem is stated and an
arithmetic procedure is given to solve it in terms of the numbers in the 
problem. The reader is supposed to recognize that this pattern of operations 
can be applied to a class of problems. 

Abstracting this experience into effective symbolic form occurred in 
historical times and led to what is now termed “algebra.” The symbolism 
includes representations of numbers either unknown or unspecified and 
operations. Relations that are consequences of the arithmetic operations 
are expressed by equations that can be manipulated to yield other relations. 
Such manipulations reflect basic set relations and cannot be established by 
observing their effect on specific numerical relations. They must be referred 
to the set relations on which arithmetic relations are based. They constitute 
experience patterns abstracted from the basic arithmetic ones. 

In modern mathematics, there are notations for sets and set operations
in which characteristic properties can be expressed. The arithmetic operations
are defined in terms of set operations. For example, the sum corresponds
to the union of disjoint sets and the product of numbers corresponds to the
cross product of sets or the set of pairs. Thus if $n_i$ is the cardinal number of the
set $A_i$, $i = 1, 2$, the sum $n_1 + n_2$ corresponds to $A_1 \cup A_2$ (for disjoint
$A_1$, $A_2$) and $n_1 n_2$ corresponds to $A_1 \times A_2$. The properties of the arithmetic
operations, for example, commutativity, associativity, and distributivity,
follow from those of the set operations.
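
The correspondence between the arithmetic operations and the set operations can be checked on a small example. The sketch below uses Python's built-in sets; the two sets `A1` and `A2` and their contents are arbitrary choices made for illustration.

```python
from itertools import product

A1 = {"apple", "pear", "plum"}   # a set with cardinal number 3
A2 = {"cup", "plate"}            # a set with cardinal number 2
assert A1.isdisjoint(A2)         # the sum requires disjoint sets

n1, n2 = len(A1), len(A2)

union = A1 | A2                  # union of disjoint sets
pairs = set(product(A1, A2))     # cross product: the set of pairs

print(len(union) == n1 + n2)     # the sum corresponds to the union
print(len(pairs) == n1 * n2)     # the product corresponds to the set of pairs

# Commutativity of the product reflects a set relation: swapping the
# members of each pair is a one-to-one correspondence between the
# cross product of A1 with A2 and that of A2 with A1.
swapped = {(y, x) for (x, y) in pairs}
print(swapped == set(product(A2, A1)))
```

A check on one pair of sets does not establish the general relation; as the text notes, that must be referred to the set relations themselves.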

This notation corresponds to a further level of abstraction. A number like 
3 or 7 corresponds to an abstraction from the counting process, which itself 
must be considered as a relatively general experience pattern applicable to
sets. Algebra deals with an abstraction process for experience patterns
involving numbers in general and their properties. These properties are also 
associated with general relations between sets. These successive layers of 
abstraction from experience patterns also illustrate the role of symbolism in 
such abstractions. 


3.7. Axiomatic Developments 

Numbers represent abstractions from experience patterns, but we now 
regard them as objects subject to the set operations that we have applied 
previously to the objects of direct experience. In particular we have a set of 
all natural numbers and we can consider subsets and set operations such as 
union, intersection, formation of a set of pairs, and consequence (the notion 
of one-to-one correspondence). This is a considerable expansion of mental 
experience possibilities. We do not deal with infinite sets of objects, for 
example, in normal experience. 

Since the set of numbers constitutes an independent set of objects, we 
can structure the associated experience axiomatically. This means that certain 
characteristic properties and relations are postulated and all other properties 
and relations are deduced from these characteristics by logical operations, 
that is, by set properties and constructions. 

The standard way in which this is done for the natural numbers is with
the Peano axioms (see E. Landau, Foundations of Analysis (5)). The set, $N$, of
natural numbers is characterized by two set-theoretic properties and a specific
correspondence, $n \to n'$. We have

(1) $1 \in N$

(2) $n \in N \to (\exists n')(n' \in N)$

(3) $(n)(n' \neq 1)$

(4) $n' = m' \to n = m$

(5) $(M \subset N) \cap (1 \in M) \cap [(n \in M) \to (n' \in M)] \to M = N$.

We have used the notation of graduate mathematics for set relations. While
Axioms 1-4 deal with $N$ and its elements, Axiom 5 is a statement about any
subset $M$ of $N$ and is a conceptual expansion.

On the basis of these five axioms one can construct functions of two 
variables, “plus” and “times,” which have the appropriate properties, and 
also a notion of “less than.” We have a one-to-one mapping of the natural 
numbers abstracted from immediate experience into this axiomatically 
defined set. The logical development based on the postulates is essentially 
different from the arguments based on abstraction. 
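
As an illustration of how “plus,” “times,” and “less than” can be built from the successor correspondence alone, here is a minimal Python sketch. The representation of numbers as nested tuples and all function names are our own assumptions. The recursive definitions used, x + 1 = x', x + y' = (x + y)', x · 1 = x, and x · y' = x · y + x, are the customary ones for natural numbers starting at 1.

```python
ONE = ("1",)                      # the distinguished element 1


def succ(n):
    """The correspondence n -> n'."""
    return ("S", n)


def plus(x, y):
    """x + 1 = x';  x + y' = (x + y)'."""
    if y == ONE:
        return succ(x)
    return succ(plus(x, y[1]))    # y[1] is the number whose successor is y


def times(x, y):
    """x * 1 = x;  x * y' = x * y + x."""
    if y == ONE:
        return x
    return plus(times(x, y[1]), x)


def less_than(x, y):
    """x < y when stripping successors in step exhausts x before y."""
    while x != ONE and y != ONE:
        x, y = x[1], y[1]
    return x == ONE and y != ONE


def numeral(k):
    """Build the k-th natural number (k >= 1) by repeated succession."""
    n = ONE
    for _ in range(k - 1):
        n = succ(n)
    return n


def to_int(n):
    """Map back to the counting numbers of immediate experience."""
    count = 1
    while n != ONE:
        n, count = n[1], count + 1
    return count


print(to_int(plus(numeral(2), numeral(2))))
print(to_int(times(numeral(6), numeral(7))))
```

The functions `numeral` and `to_int` play the part of the one-to-one mapping between the numbers abstracted from experience and the axiomatically defined set.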

Having defined operations such as addition and multiplication for the
natural numbers and having established their properties one can consider a
set of abstract objects subject to these operations. This set of objects is then
“defined axiomatically” by the existence and properties of the operations.
There is considerable leeway in choosing the properties associated with the
operations, and this choice leads to various “abstract algebras.”

An axiomatic discussion presents a logical structure that is quite valuable
in obtaining mutual agreement. The alternative is multiple ad hoc
arguments. The axiomatic logical structure is independent of any preceding 
abstraction process and presents its own criteria of rigor. 

Axiomatic discussions are also applicable to areas of experience not 
associated with discrete objects. Measurements deal with procedures for 
comparing magnitudes such as line segments or surface areas. The naive 
approach to comparing two magnitudes such as a pair of line segments 
would be to express both as multiples of a common unit and refer the situation 
to fractions or pairs of integers. But even in ancient times this was known to be 
inadequate, and Euclid’s Elements (see Heath (3) and Morrow (6) ) contains a 
more sophisticated approach. 

Geometry deals with a system of interrelated ideal objects—points, 
lines, planes—that clearly corresponds to an abstraction from spatial 
experience. This system is described axiomatically with the corresponding 
logical structure. In Euclid, arithmetic and algebra were incorporated into 
geometry by axioms. 

Elementary analysis (see Landau (5)) can also be given in terms of an
axiomatic definition of real numbers and certain set ideas. In general a
mathematical axiomatic discussion deals with a set of ideal objects subject
to certain experience patterns abstracted from some area of previous experience.
The ideal objects and the relations between them have a symbolic
representation that can be used to insure that the imaginative development is
consistent with the specified experience patterns and can also be used for
communication. Clearly these imaginative developments can be piled one
upon the other, i.e., the experience patterns of one can be the basic axioms of a
higher development. From our present point of view, the associated symbolic
construction is a representation of a purely intellectual development, not the
development itself.


3.8. Analysis 

The natural philosophers of the eighteenth century expanded the 
experience areas involved in the axiomatic development to include motion, 
gravity, electricity, magnetism, temperature, fluid flow, and the elastic 
properties of matter. But this was done at a price. The contact of the intellectual
formulation with experience in these new areas was by experiment.
However, it was clear that one could separate out a central core of mathematical
procedure, not dependent on experiment, that was used in reasoning
and prediction by computation. This central core was called analysis and it 
did represent conceptual experience adequate to provide math models for 
natural philosophy. 

In modern terms analysis corresponds to the theory of functions of a 
number of real or complex variables. In its original form it was not dependent 
on experiment, but it did involve computational procedures using infinite 
sets of quantities that had meaning only because of ad hoc intuitive appeals. 
Furthermore, analysis did not stand by itself as an axiomatic development 
but was supported by intuitive appeals to the original unabstracted situation. 
The notions of function and limit, for example, were based on the idea of
motion. 

These logical difficulties were understood. In addition the discovery of 
non-Euclidean geometry showed the imaginative and inventive character of 
mathematics. Thus, it seemed desirable to set up analysis as an axiomatic 
structure in which the constructive procedures utilized only set-theoretic 
methods. 

The resulting axiomatic development begins with the Peano structure 
for the natural numbers and establishes the arithmetic properties and the 
notion of “less than.” The positive rational numbers are defined as sets of 
equivalent pairs of natural numbers. Two pairs $(m_1, n_1)$, $(m_2, n_2)$ are equivalent
if $m_1 n_2 = m_2 n_1$. The algebraic properties of positive rational numbers include
the group property for multiplication. The extension of these obtained
by adjoining 0 and $-1$ algebraically is a field with an ordering relation “less
than.” The real numbers are defined in terms of set constructions by the
Dedekind cut process. The reader is undoubtedly aware of the set-theoretic 
constructions that correspond to the notions of complex variable, functions, 
limits, and analytic geometry. 
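
The definition of the positive rational numbers by equivalent pairs can be illustrated with a short Python sketch. The function names are hypothetical, and a pair (m, n) of natural numbers stands for the class of all pairs equivalent to it.

```python
def equivalent(p, q):
    """(m1, n1) and (m2, n2) are equivalent if m1 * n2 == m2 * n1."""
    (m1, n1), (m2, n2) = p, q
    return m1 * n2 == m2 * n1


def multiply(p, q):
    """Multiply classes through representatives: (m1 * m2, n1 * n2)."""
    return (p[0] * q[0], p[1] * q[1])


def inverse(p):
    """Exchanging the members of a pair gives the multiplicative inverse."""
    return (p[1], p[0])


UNIT = (1, 1)  # a representative of the identity for multiplication

print(equivalent((1, 2), (3, 6)))
print(equivalent(multiply((2, 3), inverse((2, 3))), UNIT))
# The product does not depend on the representatives chosen:
print(equivalent(multiply((1, 2), (3, 4)), multiply((2, 4), (6, 8))))
```

The existence of an identity and of inverses, together with associativity inherited from the natural numbers, is what the group property for multiplication refers to.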

Thus, analysis is established on a set-theoretic basis starting with the 
natural numbers. Previous mathematical developments are subsumed by 
mappings onto parts of analysis in such a way that “intuitive elements” are 
replaced by set-theoretic arguments. The exact sciences have math models 
that represent in this sense both geometrical and physical concepts. Quantitative
aspects of reasoning are referred back ultimately to integers.

Even the Peano axioms can be replaced by set-theoretic constructions.
Suppose that there is an object, $a$, that is not a set and we consider a set $\mathcal{A}$
of sets $A$ that has the following properties:

(i) $A_1 = \{a\} \in \mathcal{A}$;

(ii) $A \in \mathcal{A} \to \{A\} \in \mathcal{A}$;

(iii) $(B \in \mathcal{A}) \cap (A \in B) \to$ [consequent illegible in the source];

(iv) $(\mathcal{B} \subset \mathcal{A}) \cap (A_1 \in \mathcal{B}) \cap [(A \in \mathcal{B}) \to (\{A\} \in \mathcal{B})] \to \mathcal{A} = \mathcal{B}$.

Intuitively, the set $\mathcal{A}$ consists of the sets

$\{a\}, \{\{a\}\}, \{\{\{a\}\}\}, \ldots$

and there is a one-to-one correspondence with the natural numbers 1, 2,
3, .... However, we can readily show formally that $\mathcal{A}$ satisfies the Peano
axioms. We sketch the discussion.

Clearly $1 \sim \{a\}$, and if $A \sim n$, then $\{A\} \sim n'$, and we have Axioms 1, 2, and 5.
To show Axiom 3 (i.e., $n' \neq 1$) and Axiom 4 (i.e., $n' = m' \to n = m$), we show the
following theorem:

$(A \in \mathcal{A}) \cap (b \in A) \cap (c \in A) \to b = c.$

Let

$M = \{A : (A \in \mathcal{A}) \cap [(b \in A) \cap (c \in A) \to b = c]\}.$

But the definitions immediately yield $A_1 \in M$ and $A \in M$ implies $\{A\} \in M$.
Thus, (iv) above yields $\mathcal{A} = M$, and hence the theorem. The Peano Axioms 3
and 4 follow readily from the theorem.
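
The nested sets {a}, {{a}}, {{{a}}}, ... of this construction can be modeled with Python's immutable `frozenset`. The sketch below is an illustration under our own assumptions: the string "a" stands for the single object that is not a set, the names `successor` and `first` are hypothetical, and checking a finite initial segment is of course not a proof.

```python
a = "a"                      # stands for the single object that is not a set

A1 = frozenset([a])          # property (i): A1 = {a}


def successor(A):
    """Property (ii): from A form the set {A}."""
    return frozenset([A])


def first(k):
    """The first k members of the family: {a}, {{a}}, {{{a}}}, ..."""
    sets, A = [], A1
    for _ in range(k):
        sets.append(A)
        A = successor(A)
    return sets


family = first(6)

# The theorem: each member has exactly one element.
print(all(len(A) == 1 for A in family))

# Axiom 3: a successor {A} is never the first member {a}.
print(all(successor(A) != A1 for A in family))

# Axiom 4: {A} = {B} only when A = B.
print(all(A == B
          for A in family for B in family
          if successor(A) == successor(B)))
```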

Thus, all analysis can be expressed in set-theoretic terms, assuming the 
existence of a single object. It is “logical” in the sense that the conceptual 
experience consists of “constructing sets,” a mental process in which objects
that themselves may be mental are associated into a set. 

This set-theoretic analysis also leads to axiomatic developments. Axiom 
patterns can be abstracted from various parts of set-theoretic analysis, with,
of course, the set-theoretic concepts themselves. The logical structure consists
of arguments, symbolically described, corresponding to the abstracted 
patterns and set theory. This yields a very extensive mathematics, including 
abstract algebras, topologies, and linear spaces, i.e., graduate mathematics. 
These, of course, frequently parallel previous axiomatic developments. 

The basic drive behind this development was the demand for rigor. It is 
now widely accepted that set theoretic arguments are logically satisfactory. 
For axiomatic structures, in general, which are considered formats for 
experience rather than experience itself, freedom from contradiction is 
certainly satisfactory (see Landau (5) ). 


3.9. Modern Formal Logic 

There are those who object to the characterization of mathematics as an 
“intellectual development” and who wish to assign all essential significance 
to the symbolic procedure. Modern “formal logic” is based on the principle
that there are certain formats of words and statements that are true. For
example, “A included in B and C included in A implies C is included in B.”
The concern of “formal logic” as represented, say, in the Principia Mathematica
of Whitehead and Russell,(10) is to set up an axiomatic development of
these formats. The ingenious system of symbols of formal logic is widely used. 

This principle of formal logic involves the idea that the “truth” is 
associated with symbolic forms and procedures. Of course, the reason one is 
willing to accept such syllogisms as the above is that one can imagine classes 
of objects corresponding to certain denoted properties. Mathematics is 
usually presented in set terms with imaginative set constructions. But this 
principle requires that all such intuitive or imaginative elements be replaced 
by explicit symbolic formulations. The corresponding notation is familiar to 
the reader, for example, $(x)(P(x))$, $(\exists x)(P(x))$, and $\{x : P(x)\}$.

Thus, the usual development of analysis can be represented in this 
notation, i.e., in appropriate symbolic form. But this requires a postulational 
justification for the various set constructions, and the obvious procedure is 
to set up a set of rules governing the construction of the symbols for sets. For 
example, if one has a property for which one can express the fact that this 
property holds for x by the symbol P(x), then one has the set {x: P(x)}. 

In the present context, this is equivalent to an axiomatic development of 
set theory. The postulates are the construction rules for symbols for sets. In the 
last century, Frege proposed such an axiomatic formulation as a fitting 
climax to the effort, sustained over two hundred years, to obtain a precise 
rigorous mathematics. 

The result was a catastrophe. Bertrand Russell showed that the axioms 
led almost immediately to a contradiction. Let $\sim$ stand for the denial of a
statement. Consider the set $\sigma = \{x : \sim(x \in x)\}$, i.e., the set of those constructs
that are not members of themselves. Then $\sigma \in \sigma$ implies $\sim(\sigma \in \sigma)$, and
$\sim(\sigma \in \sigma)$ implies $\sigma \in \sigma$.

In general, the idea of applying set concepts to the set of sets leads to 
difficulty. The construction procedure must be much more sophisticated. 
Russell proposed a hierarchical structure that was not subject to the given
difficulty, and other construction procedures have been proposed. But this
also developed problems of undecidability. 

These difficulties led to an analysis of the symbolic procedures of 
mathematics. In this analysis mathematical concepts are used on a frankly 
conceptual basis with abstractions from the symbolic procedures. As a 
consequence one has a mathematical treatment of general questions concerning
the structure of mathematics such as consistency, categoricalness,
and decidability. This procedure is called “metalogic.” 

It is not clear that one can obtain every aspect of mathematics by symbolic
procedures alone without sharing conceptual experience. The symbolism
of mathematics is designed for communication and is considered successful
if the designated intellectual experience is shared. The requirements of
formal logic constitute an additional demand on the symbolism that it be
associated with “truth” in the Platonic sense.

Thus, the symbolism has the character of the higher unchanging logic 
that Proclus sought. Proclus objected to mathematics because its truth is 
conditional, i.e., the theorems require hypotheses and the conclusion is valid 
only when the hypothesis is satisfied. But the “true” statements in a symbolic 
formal logic are valid unconditionally. Thus, $(p \to q) \cap (q \to r) \to (p \to r)$ is
always valid. It is not evident that this extra requirement will be universally 
accepted or how it can be satisfied. On the other hand this requirement does 
not appear necessary for applied mathematics. 
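
The sense in which such a statement is valid unconditionally can be made concrete by checking every assignment of truth values. The following Python sketch does this for the pattern “p implies q, and q implies r, implies p implies r”; the names `implies`, `syllogism`, and `is_tautology` are our own.

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p holds and q fails."""
    return (not p) or q


def syllogism(p, q, r):
    """(p -> q) and (q -> r) -> (p -> r)."""
    return implies(implies(p, q) and implies(q, r), implies(p, r))


def is_tautology(formula, arity):
    """True when the formula holds for every assignment of truth values."""
    return all(formula(*values)
               for values in product([False, True], repeat=arity))


print(is_tautology(syllogism, 3))   # valid for all eight assignments
print(is_tautology(implies, 2))     # a bare implication is conditional
```

By contrast with the first formula, the bare implication fails for one assignment, which corresponds to the conditional character of ordinary theorems.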

A formal logic development is axiomatic and can be considered to be 
the construction of a sequence of symbolic expressions. Constructive procedures
of this type can be subject to mathematical analysis in order to deal
with questions of consistency, independence, and decidability. The experience 
patterns of formal logic can be abstracted and supplemented by set-theoretic 
analysis. This yields a sophisticated mathematics of great interest in itself 
that can be regarded as a conceptual format for experience. Thus, one can 
consider the modern theory of formal logic as mathematics without making 
the assumption that all logical experience must conform to it. 

Mathematics is frequently described as a language, and it is worthwhile 
to make this comparison. In both cases, one has symbolic communication 
between people relative to a common background of experience and the 
possibility of an imaginative projection of experience. Of course, in mathematics
the symbolism is precise enough so that the usual ambiguities of
language are eliminated. But communication is only one aspect of mathematics.
Mathematics also involves an intellectual development consisting of
abstracted patterns of experience. This development is the subject for the 
communication in mathematics and is more central than the communication 
aspect. Mathematics is equivalent to a language plus a literature. 

For further discussion of the material in this section, see Church, (1)
Morrow, (6) Styazhkin, (8) Takeuti and Zaring, (9) and Whitehead and
Russell. (10)


3.10. Pure and Applied Mathematics 

It is true that there are certain difficulties in applying set concepts to the 
collection of all sets. However, in regard to analysis there is considerable 
confidence that a satisfactory symbolic representation of the set-theoretic 
structure exists. Thus, starting with the existence of a single object, the 
mathematician can construct the whole of mathematics, using no imagined
procedure except the forming of sets. 

The mathematician exclaims “C'est moi,” and it is clear that mathematics
is an exclusively intellectual development. This is indeed the way mathematics
is presented in our graduate courses. The mathematician needs only
mathematics in his research and lectures. Previous forms of mathematics are 
incorporated into the present one in a new rigorous form, and this independence
permits a wealth of development, which is of great interest and offers
the greatest possible freedom for scientific exploration. The applied mathematician
is welcome to use this logical development, but this use as such is not
considered to contribute to the fundamental mathematical structure. 

But this independence definitely involves a certain isolation. Most areas 
of applications involve, conceptually, earlier forms of mathematics and are 
still taught in this way. The user develops a familiarity with some such 
development and may be unaware of the corresponding “pure” or modern 
mathematics. His mathematics may correspond to the early nineteenth 
century or to one of the two forms of the eighteenth century. These, of course, 
have been subsumed into modern mathematics, but the reward to the user for 
mastering a new and demanding intellectual discipline is frequently just the 
assurance of rigor and not necessarily any increase in capability. 

Indeed, the increase in capability in the sense of obtaining new computational
procedures may occur outside the framework of accepted mathematics,
as for instance, the Heaviside operational calculus or the Dirac delta function. 
Mathematics has been expanded to include a justification for these. This is an 
intellectual triumph and represents some increase in applicability, but it also 
shows that intellectual restrictions are basically unjustified. 

One must then anticipate the possibility that the user of applied mathematics
will function intellectually in a way that is independent of “pure”
mathematics, while the mathematician will learn his mathematics in the 
latter framework. Technical developments frequently make it desirable to tap 
the rich vein of available mathematics. In most applied areas there is a 
relatively standard mathematics, which is normally adequate, but the explosive
development of technology may require an expansion in the mathematics
used. The use of differential equations was tremendously expanded by the
availability of automatic computation, but this expansion required a new
theoretical base in the existence and uniqueness results from pure mathematics.

The applied mathematician must make such contributions, and he must 
also involve his mathematics in the general understanding of the situation. 
But this requires a somewhat different intellectual background than that for 
pure mathematics, and indeed the development of such an intellectual background
can be a very satisfying professional activity. One must appreciate
that the present conceptual structure of mathematics is the result of a historic 
process in the last hundred years, motivated by an overriding concern with
logical precision. Previous phases of mathematics have been an integral part
of what is termed the “scientific revolution” and were concerned with developing
the power of mathematics. There are clear indications that initially the
fascination of the conceptual development itself was the major element. 

Our text contains an introduction to these matters but by no means a 
complete account. Our next objective will be to provide a historical framework
in which the present development of the exact sciences and mathematics
will appear in perspective. The pure mathematician who does not want 
mathematics to be considered to be in an isolated subculture may also find 
this of interest. 


3.11. Vocational Aspects 

It is desirable to summarize the significance of this chapter from the 
vocational point of view of the applied mathematician. The understanding of 
the situation that is the target of the effort is not a “body of knowledge” but a 
procedure for coping with the situation. Part of this procedure is the effective 
use of the math model and computation. The appropriate role for the applied 
mathematician is the responsibility for incorporating this effective use and 
computation into the overall procedure. 

This requires an “understanding” of the basis for the math model in the
sense that one must appreciate the experience patterns that correspond to the
concepts that are used in handling the situation. Information on relevant
past experience and on desirable and undesirable results is expressed in these
concepts. “Facts” do not exist independently of a conceptual framework of
past experience. 

The type of understanding described in the previous paragraph generally
involves a considerable effort to comprehend technical reports, especially
in regard to their quantitative significance. This may seem overemphasized,
but there have been simulations in which the computed results were
entirely irrelevant to the situation. In the early days of digital computation, 
an effort was made to function on the principle that the user group would 
prepare a mathematical formulation of the problem and a group associated 
with the computer would solve it. Presumably, this would lead to a most 
efficient use of the computer, but the total result was disastrous on occasion. 
The difficulties were compounded by a number of factors, but the major 
weakness was the limitation on the conceptual communication. 

Communication must be in terms of the mathematical and technical 
procedures of the group, and the mathematician normally must exert considerable
initiative to establish this. Certainly he cannot wait until someone
else “expresses the problem in mathematical terms.” The asset the mathematician
has is the tremendous amount of mathematics available. But he
will have to make the conceptual connection. This means he must develop a 
scientific understanding associated with his mathematics that is not readily 
available academically. He must develop an appreciation of the historical 
dimension in general, as well as in the specific situation he is dealing with. 


Exercises 

Term Project (see p. 8): In your term project, describe the situations with which
one deals, the concepts used in describing the situation, the experience patterns used
for prediction and the mathematical formulation used for them, the conceptual development
used for reasoning, the action options available, and the comparisons used
for decisions.


3.1. For each of the experiences listed below, describe (i) the associated areas of 
experience, (ii) the possible favorable and unfavorable aspects, (iii) the experience 
patterns of actions and consequences that we take into account, (iv) the extent and 
general character of our awareness of these experience patterns, (v) the communication 
processes involved, and (vi) whether the situations involved are present, future, anticipated,
obtainable, or avoidable.

(a) Taking a course at a university
(b) Employment
(c) Shopping in a supermarket
(d) Learning to drive a car with a manual shift
(e) Seeing a stop sign at the next intersection
(f) Seeing a state police car with a radar device on the highway ahead
(g) Paying one's annual federal income tax
(h) Watching a play or motion picture
(i) Playing chess or bridge
(j) Gambling with dice or betting on horse racing
(k) Participating in sports
(l) Reading
(m) Having thirteen guests for dinner


3.2. In discussing each area of Exercise 3.1, various nouns are used. In which cases 
do these nouns correspond to the names of concepts that are classical logical forms? 
What attributes associated with these forms are relevant to the discussion? In which
cases is there a definition associated with these nouns that indicates a pattern of experience
or an attribute? When does the definition permit a range of conceptual experience?
What is your basis for understanding the form or definition? 

3.3. Describe the different ways in which we become aware of situations of interest
to us. What information is available to us without any action on our part, what information
is due to involuntary actions on our part, and what information results from deliberately
directed actions on our part? Are these distinctions always clear cut?

3.4. In dealing with a situation do you list the possible actions you could take, i.e., 
your “options”? Do you think you should? 

3.5. Have you ever had occasion to change your ideas about some aspect of experience?
Can you describe the sequence of events that led to this change? Can you diagram
the corresponding experience?

3.6. At the end of Section 3.1, there is a list of possible criteria for a body of knowledge.
Expand and discuss this list and Pilate's famous question, “What is truth?”

3.7. List the comprehension concepts of geometry, physics, astronomy, chemistry, 
and biology. How does one learn the meaning of these concepts? 

3.8. Describe the procedural definition of length in physics using the standard 
definition. Investigate the technological applications of this definition. What is the 
procedural definition of distance in astronomy? Is there a procedural equivalence 
between the concepts of length and distance? 

3.9. If the result of a measurement is a real number, the choice of scale for the
measurement is usually arbitrary. Describe dimension theory and its significant applications
in aerodynamics, thermodynamics, and electromagnetism.

3.10. Various modern engineering procedures have been based on an analogy
between electrical circuits and mechanical systems. What is the mathematical formulation
corresponding to the concepts involved? What concepts are more readily available
in one system than the other? 

3.11. Descriptive biology is closely associated with certain concepts derivable as
formal concepts of classical logic. Describe the development of the associated “classification”
systems and the dependence of the recognition procedure on the point of view
of the originators and the availability of techniques. Describe the associated diagrammatic
representation. How was the conceptual structure affected by developments in
physics, chemistry, and by the theory of evolution? 

3.12. Describe the various milieus of experience associated with classical mechanics.
How did this experience develop historically? How did the quantitative concepts
develop and what are the now accepted mathematical relations involved? What are the 
present situations of technical interests? What are the mathematical procedures used for 
analysis and prediction? Answer the same questions for classical electrodynamics and 
thermodynamics. 

3.13. The following mathematical topics are associated with various areas of
applications.

(a) Ordinary differential equations
(b) Partial differential equations
(c) Fourier series
(d) Fourier transforms
(e) Probability distributions
(f) Linear operators in Hilbert space
(g) Linear functionals
(h) Integral equations

What are the comprehension concepts associated with the various applications? Name
and describe the mathematical procedures used for analysis and prediction. Describe
the formulation of the output information required and its relation to comprehension
concepts.

3.14. Classify the concepts that appear in our dealing with matter in its solid form.
In which cases are quantitative measures associated with these concepts? Notice that in
certain areas of experience weight is considered a direct measure of quantity, but in
elementary physics weight is simply a force, and quantity is associated with inertial
mass.

3.15. Define the concepts “liquid” and “gas” in terms of experience. What is the
relation to the notion of “solid”? How would one describe the general notion of matter?
The sciences of physics and chemistry “explain the nature of matter.” What does this
mean in terms of the quantitative concepts discussed in this and the preceding exercise?

3.16. In modern physics an elementary particle “has both particle properties and
wave properties.” What is the experimental and mathematical meaning of this statement?
What are the associated concepts?


3.17. Describe the experimental verification of Newton’s law of gravity. It is usual
to consider that Newton's law of gravity and the associated mechanics apply to a
limited milieu within the area of general relativity. Describe this idea mathematically and
describe the associated experimental verification. Do this for the similar relation
between “geometric” and “physical” optics. Also for classical and quantum electrodynamics.
Also for classical and quantum thermodynamics.

3.18. Diagram the relation between the various topics that are “exact sciences”
in our sense in physics. Indicate how comprehension concepts structure the relations 
between them. Indicate how measurement procedures vary within these concepts and 
how geometric and space-time concepts are used to unify. 

3.19. What is the relation between counting, keeping accounts, and arithmetic? 
What is the fundamental concern? What are the comprehension concepts and experience
patterns used to translate various business activities into a form to which arithmetic
is applicable? What are the action decisions resulting from this process and what are the 
desirable results? What is the meaning of the terms double entry bookkeeping, assets, 
debits, capital, purchase, sales, inventory, production, cost, cost accounting, gross and 
net return, and growth? What is the relation with arithmetic? What arithmetic checks 
are used? 

3.20. Describe verbally the patterns of arithmetic experience that are expressed 
symbolically in algebra. 

3.21. Geometry has been described as the idealization of spatial experience. Its
name and certain historical evidence indicate that this spatial experience was surveying.
What are the concepts used in surveying? In what sense are they idealized? The development
of mathematics has added many more geometrical concepts. Describe them and
their corresponding spatial experience.

3.22. Bertrand Russell, in his essay (7) “Mathematics and the metaphysicians,”
states, “Pure mathematics was discovered by Boole, in a work he called the ‘Laws of
Thought’ (1854).... His book was in fact concerned with formal logic, and this is the
same thing as mathematics.” Outline the basis for Russell’s statement by references to the
literature. Is there a conceptual framework inherent in the establishment of formal logic?
Russell also states that, “The propositions [of logic] can be put in a form to apply to
anything.” How does Russell prove that 2 + 2 = 4 is a theorem of formal logic? How does
this apply to four apples? Two apples and two pears? Milk? Bubbles? What concepts did
you add to make the theorem applicable? Is the result always significant? What is the 
relation with the question: is the milk sour? 

3.23. Under the influence of Newtonian particle physics, certain early theories of
elasticity were based on the assumption that matter consists of “atoms” in certain
arrangements with forces acting between them. In “mathematical elasticity,” matter is
supposed to be homogeneous but not isotropic, and concepts of stress and strain are
defined. What common concepts permit a theoretical comparison and an experimental
comparison of the two theories? The texts on elasticity insist that the first theory is
“essentially different from modern atomic theories.” Explain the difference in mathematical
theory. What are the ranges of experience associated with this difference?

3.24. Describe the aspects of the following that involve the construction of sets:
(a) a topology on a set, (b) closure, (c) complementarity, (d) union, (e) intersection, (f)
Baire sets, (g) filters, (h) maximal filters, (i) all subsets, (j) completion, (k) measurable sets,
(l) sets of measure zero, (m) nonmeasurable sets, (n) metric spaces, (o) continuity, (p)
Zermelo’s axiom. When is the constructive aspect described axiomatically? When do you
think it should be?

3.25. In the usual development of the theory of finite groups, what theorems from
the theory of finite sets are used? Where are these proven? Give a complete axiomatic
basis for the theory of finite groups. In the theory of a finite dimensional algebra over a
field K, what structures from set theory are added to the properties of K? Give a complete
axiomatic basis. Banach's original treatment of linear spaces was based on a
sequence of “definitions” in each of which additional mathematical apparatus was
introduced on what was initially a “linear vector space.” Describe these spaces axiomatically.
Banach also uses the concept of “transfinite induction.” What is a complete set
of axioms for this notion? How are these related to Zermelo's axiom and Zorn's lemma?
What is the relation of Hilbert space to the spaces described by Banach?

3.26. An important initial part of the development of analysis is concerned with the 
properties of continuous functions of one real variable on a closed finite interval: these 
properties permit the definition of the Riemann integral. What property of a closed 
finite interval that is not valid for the set of all real numbers is crucial in this discussion? 
What set constructions and axioms are used in establishing it? What logical assumptions
are involved in its use? 

3.27. In linear space theory one is usually concerned with linear sets or closed
linear sets, and the “element of” relation is replaced by an inclusive relation of a
one-dimensional subset. The concepts of union, intersection, and element of for ordinary
sets in general satisfy certain algebraic relations and constitute, therefore, a “Boolean
algebra.” The union of two linear sets is in general not a linear set. One can, however,
obtain an algebraic structure associated with linear sets called “lattice theory.” Describe
the axioms of lattice theory. How are these applied to linear or closed linear sets?
Contrast these with Boolean algebra. Obtain Boolean relations that fail in the lattice 
case. Boolean algebra is associated with elementary formal logic. What would be the 
result if one used lattice theory instead? What difference would it make if one used 
translates of linear sets instead of just the linear sets? 

3.28. Give the formal symbolic arguments for the proofs in Section 3.8 of statements
1 and 2 concerning the set theoretic formulation of the natural numbers. Underline
purely set-theoretic assumptions.

3.29. Formalize elementary number theory, starting with Peano's axioms and
obtaining the theorem on unique factorization. 

3.30. One important meta-type argument is Gödel’s proof of the existence of
nonprovable statements in certain types of formal logic. Study this from the point of 
view of how various mathematical concepts appear in the argument. 

3.31. What analytic and algebraic concepts are applied to geometry in Euclidean,
affine or projective geometry, algebraic geometry, combinatorial topology, and differential
geometry? What geometric concepts are relevant to the implicit function theorem,
the theorem on functional dependence? 

3.32. Certain concepts of circuit theory, especially those associated with the time 
behavior of electrical quantities, have been associated with the Laplace transform. 
Describe them. What other mathematical concepts are used in circuit theory? 

3.33. Diagram pure mathematics in a form in which independent axiomatization 
corresponds to a process and the topics are the results. Associate geometry and the 
various scientific theories with this diagram. 

3.34. What problems were considered by Archimedes? Cavalieri? Wallis? Newton?
Euler? Laplace? Gauss? Cauchy? Riemann? Weierstrass? Cantor? Borel? Lebesgue? 
Hilbert? L. Schwartz? 

3.35. Consider any elementary development of group theory. Should elementary
arithmetic be included in the axiomatic basis? 

3.36. Beginning with telegraphy, describe the relationship of scientific understanding
and communication.


Ancient Mathematics


4.1. Ancient Arithmetic 

We are interested in the improvement of mathematical understanding. 
Mathematical understanding is a necessary support for most complex 
cultures and is usually incorporated into both basic and technical education. 
Thus our present mathematical education has a layer structure, with the 
lower layers corresponding to the most widespread needs. In general the 
mathematical education of a culture is an indicator of its technical aspects. 
Since educational material tends to survive because there is so much of it, 
it is an excellent basis for the study of the growth of mathematical understanding.

There is a superficial correspondence between the historical development
and the present layers of mathematical education, but this is misleading
unless checked against the historical record. The historical record is much 
more revealing of the growth of understanding that was due to the activities 
of mature people in definitely nonscholastic environments. Mathematics 
itself has been continuously readjusted even in its most elementary aspects 
so that the main emphasis in the study of the historical development must 
deal not with the addition of layers but with the many metamorphoses 
of the conceptual structure. Our elementary arithmetic in its present form 
does not predate 1650 a.d., and yet it has had an evolutionary development 
that certainly began before the dawn of history. 

The initial basis of arithmetic is counting. This capability certainly 
preceded the keeping of written records. The effective use of counting normally
requires the aggregation of unit objects into larger units. Such aggregation
is indicated in the names of numbers like twenty, thirty, or forty, and
aggregation must have been a part of counting this high. Language also
contains many examples of aggregation, such as dozens, stones, scores, etc.,
that serve practical purposes. The elementary operations of addition and
multiplication are further consequences of aggregation and are usually 
associated with a further step in which one has systematic and repeated 
aggregation, for example, the aggregation into tens, hundreds, thousands, etc. 

The earliest written records dealing with arithmetic are from Egypt 
and Babylonia. There are a number of manuscripts from Egypt, the most 
important of which is the Rhind papyrus. It was written before 1700 b.c. by a 
priest, Ah’mose, and was presumably used for instructional purposes since 
much of it is in the second person. This document is translated and discussed 
in The Rhind Mathematical Papyrus by Chace et al. (2) 

Another ancient papyrus, referred to as the Moscow papyrus, is discussed
in “Mathematische Papyrus des staatlichen Museums der schönen
Künste” by Struve. (11) Ancient Egyptian arithmetic is discussed in “Mathematics
in Ancient Egypt” by Peet (10) and in Science Awakening by Van der
Waerden. (12) Mathematical tablets from Babylonia are treated in Mathematical
Cuneiform Texts by Neugebauer and Sachs. (9)

In the Rhind papyrus, natural numbers are expressed in a decimal
system, which is not a place system but one in which different symbols are
used for 1, 10, 100, etc. For example, one could use | to correspond to one,
∩ to ten, and C to a hundred, and 241 would be written | ∩∩∩∩ CC. The
rule for addition in this notation is obvious. However, the calculation with
natural numbers is more or less assumed in this document and numbers
used are “mixed numbers,” that is, natural numbers plus a fraction. This
fraction is expressed as a sum of terms in the form 2/3 or 1/n, n = 2, 3, ....
Obviously, the conceptual basis for this form of expression is quite different 
from that for our usual numerator-denominator fractions but precisely 
what it is is not clear, except that a process of choosing a smaller unit is 
involved. 

Multiplication of natural numbers was based on a process of repeated 
doubling, i.e., of adding a number to itself. Suppose one wishes to multiply 
231 by 19. One doubles the multiplicand repeatedly, writing the result beside 
the corresponding multiplier: 


    *1      231
    *2      462
     4      924
     8     1848
   *16     3696

    19     4389


One stops with the highest multiplier that does not exceed the multiplier of
the original problem. One then checks the multipliers that add up to the
original multiplier (marked with an asterisk above) and adds the corresponding
multiples of the multiplicand. This yields the product, 4389.
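
The doubling procedure just described can be written out as a short program. This is a minimal sketch rather than a reconstruction of any scribal practice. The function name `egyptian_multiply` and its return format are illustrative assumptions. Only addition is used: each row of the table is obtained by adding the previous row to itself.

```python
def egyptian_multiply(multiplicand, multiplier):
    """Multiply two natural numbers by repeated doubling.

    Returns (product, rows, selected), where rows is the doubling table
    and selected lists the multipliers that were checked off.
    """
    if multiplicand < 0 or multiplier < 1:
        raise ValueError("expects multiplicand >= 0 and multiplier >= 1")

    # Build the table: 1, 2, 4, ... beside the corresponding multiples.
    rows = []
    power, multiple = 1, multiplicand
    while power <= multiplier:
        rows.append((power, multiple))
        power, multiple = power + power, multiple + multiple

    # Check off, from the largest down, the multipliers that add up
    # to the original multiplier, and add the corresponding multiples.
    remaining, product, selected = multiplier, 0, []
    for power, multiple in reversed(rows):
        if power <= remaining:
            remaining -= power
            product += multiple
            selected.append(power)
    return product, rows, selected


if __name__ == "__main__":
    product, rows, selected = egyptian_multiply(231, 19)
    for power, multiple in rows:
        mark = "*" if power in selected else " "
        print(f"{mark}{power:>4} {multiple:>6}")
    print(product)  # 4389
```

For 231 times 19 the checked multipliers are 16, 2, and 1, so the sum is 3696 + 462 + 231 = 4389, in agreement with the table above.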

The division procedure is similar. Suppose one wishes to divide 2783 by 
39. The procedure is based on doubling the divisor 


     1       39
     2       78
     4      156
     8      312
    16      624
    32     1248
    64     2496

    71     2769


until the next double would exceed the given dividend. One then adds a
selection of these multiples until the sum is less than the dividend by a quantity
less than 39. This sum, 2769, corresponds to 71 × 39 with a remainder of
14. One can continue the multiplication of 39 with fractions:


    71 + 1/3 + 1/39      2783

This Egyptian procedure was utilized for many centuries. 
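
The division procedure can be sketched in the same way. The code below is an illustration under stated assumptions: the name `egyptian_divide` is hypothetical, and the greedy selection from the largest multiple downward is one convenient way to choose "a selection of these multiples", not necessarily the scribe's. The final lines check with exact fractions that the remainder 14 out of 39 equals 1/3 + 1/39, which is one way to express the fractional continuation of the quotient.

```python
from fractions import Fraction


def egyptian_divide(dividend, divisor):
    """Divide by doubling the divisor; returns (quotient, remainder)."""
    if dividend < 0 or divisor < 1:
        raise ValueError("expects dividend >= 0 and divisor >= 1")

    # Double the divisor until the next double would exceed the dividend.
    rows = []
    power, multiple = 1, divisor
    while multiple <= dividend:
        rows.append((power, multiple))
        power, multiple = power + power, multiple + multiple

    # Add multiples, largest first, as long as the sum stays within
    # the dividend. What is left over is smaller than the divisor.
    quotient, total = 0, 0
    for power, multiple in reversed(rows):
        if total + multiple <= dividend:
            quotient += power
            total += multiple
    return quotient, dividend - total


if __name__ == "__main__":
    quotient, remainder = egyptian_divide(2783, 39)
    print(quotient, remainder)  # 71 14

    # The remainder as a sum of unit fractions of the divisor.
    assert Fraction(remainder, 39) == Fraction(1, 3) + Fraction(1, 39)
    assert (quotient + Fraction(1, 3) + Fraction(1, 39)) * 39 == 2783
```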

In modern terms, this procedure involves expressing the multiplier in 
binary form, but it would perhaps be more appropriate to say that the simplest 
form of multiplication is to double a quantity. 


4.2. Egyptian Mathematics 

In Ah'mose, the operations on the integers are assumed to be available. 
The main concern is with operations on mixed quantities. For these, a number
of problems must be solved in order to obtain an effective arithmetic equivalent
to that for the rational numbers. Quite a number of applications are discussed
in the Rhind papyrus and the author is clearly interested in the
procedures as much as in the problems themselves.

One problem is to express 2/n, where n is an odd number, as a sum of
fractions with numerator 1 (this is required for addition and for multiplication,
since multiplication is based on adding a number to itself). One can
always solve this problem since

$$\frac{2}{2k+1} = \frac{1}{k+1} + \frac{1}{(k+1)(2k+1)},$$

but in general this is not the preferred solution, since the denominator in the 
last term may be large. The Rhind papyrus begins with a discussion that 
obtains the appropriate expression for 2/n for n = 5, 7,..., 101, and different
procedures are used in various cases so that one has the equivalent of a
mathematical investigation in which the terms 1/n are clearly entities. Other
problems are to add a number of such expressions when the sum exceeds 
1 or to subtract such an expression from 1. The procedure usually involves 
choosing one of the denominators and using the corresponding fraction as a 
new unit in terms of which the other fractions are expressed, possibly as 
mixed numbers. For example, 

H + ^ + H( 4i+3+,i+i H( 9+ rt) 
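
The general decomposition of 2/n into two unit fractions, with n = 2k + 1, can be verified with exact rational arithmetic. The following is a minimal sketch. The function name `split_two_over_odd` is an illustrative assumption, and the code checks only the general identity, not the particular decompositions the papyrus prefers.

```python
from fractions import Fraction


def split_two_over_odd(n):
    """Return the two unit fractions 1/(k+1) and 1/((k+1)n) for odd n = 2k+1."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd number >= 3")
    k = (n - 1) // 2
    return Fraction(1, k + 1), Fraction(1, (k + 1) * n)


if __name__ == "__main__":
    # Check the identity over the range treated in the papyrus table.
    for n in range(5, 102, 2):
        first, second = split_two_over_odd(n)
        assert first + second == Fraction(2, n)

    print(split_two_over_odd(5))    # 1/3 and 1/15
    print(split_two_over_odd(101))  # 1/51 and 1/5151
```

The second denominator grows roughly like the square of n (5151 for n = 101), which illustrates why this general solution is not the preferred one.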


The basis of the Egyptian economy was the production of grain, which 
was consumed either as bread or beer. The equivalence between units of 
these commodities is based on the amount of grain in them, and this leads to 
a number of problems in proportion. Plans were made on an annual basis, 
e.g., the amount of grain needed for a certain number of people was estimated. 
Another type of problem involved a division of supplies on an unequal basis, 
which leads to a problem in arithmetical progression. The relative values of
precious metals produce problems of exchange. If one wishes to anticipate
the amount of grain available from a given field, one must estimate the area.
There is also the matter of taxes. The storage of grain raises questions concerning
the volume of a corn crib whose dimensions are given.

The problems associated with addition are indicated above. Addition 
will yield multiplication, but division (or its equivalent) requires further 
consideration. For example, suppose one is asked to find a quantity such that 
itself and one-seventh of itself total 19. The number 1 + 1/7 is considered to be
the operation of multiplying by this quantity. When applied to 7, it yields 8.
Now 8 is contained 2 + 1/4 + 1/8 times in 19. Thus, the unknown quantity must
contain 7 the same number of times, i.e., the answer is 16 + 1/2 + 1/8. The number
1 + 1/7, therefore, is handled as a proportion. Other examples given in the
papyrus correspond to division by 1+14-4 + 4’ 1 + 4+4’ and 3+4- 
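
The proportional reasoning in the problem of a quantity that, with one-seventh of itself, totals 19 can be traced step by step with exact fractions. The sketch below is illustrative only. The function name `solve_by_proportion` and the choice of 7 as the trial value are assumptions made to follow the argument in the text.

```python
from fractions import Fraction


def solve_by_proportion(parts, total, trial):
    """Find x with (sum of parts) * x == total, using a convenient trial value.

    parts: the pieces of the multiplier, e.g. 1 and 1/7
    trial: a value on which the multiplier acts simply, e.g. 7
    """
    factor = sum(parts)                      # 1 + 1/7
    trial_result = factor * trial            # applied to 7, it yields 8
    ratio = Fraction(total) / trial_result   # how often 8 is contained in 19
    return ratio * trial                     # the unknown contains 7 as often


if __name__ == "__main__":
    x = solve_by_proportion([Fraction(1), Fraction(1, 7)], 19, 7)
    print(x)  # 133/8

    # The ratio and the answer as sums of a whole number and unit fractions.
    assert Fraction(19, 8) == 2 + Fraction(1, 4) + Fraction(1, 8)
    assert x == 16 + Fraction(1, 2) + Fraction(1, 8)
    assert x + x / 7 == 19
```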

Rectilinear areas are readily handled in the rectangular case. Circular 
area and cylindrical volume are effectively handled from the practical point 
of view. If d is the diameter, the area of a circle is taken to be $(8d/9)^2$. There are
also problems in proportion associated with the various measurements of a
pyramid. Certain problems deal with numbers as such, but others are expressed
in terms of measures of grain, loaves of bread, measures of beer, etc.
We have, therefore, an arithmetic, but the association with the practical 
situation is close. 

The obvious purpose of the manuscript is to describe procedures for 
solving problems of various types. The elements of the problem are given 
numerical values, and operations on these are described in terms of these 
values. The directions for the procedure are given in the second person. You 
are told to do this or that with these numbers. For example, the student is 
told that when he has reached the exalted estate of scribe and someone 
requests the area of a circular field that is 9 cubits across, then he is to take 
1/9 of the 9 cubits, which is 1, and subtract it from the 9 cubits and square the
result. Lest the feeble-minded student confuse the 9 cubits with the 1/9 that
appears, the problem is repeated with a diameter of 10 cubits. 
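
The rule for the circular field can be stated as a short procedure and compared with the modern formula. This is a minimal sketch, reading the rule as "remove one-ninth of the diameter and square what remains". The function names are illustrative, and the comparison with pi is a modern check, not part of the papyrus.

```python
import math
from fractions import Fraction


def egyptian_circle_area(diameter):
    """Area by the scribe's rule: take away 1/9 of the diameter, then square."""
    d = Fraction(diameter)
    reduced = d - d / 9
    return reduced * reduced


def implied_pi():
    """The value of pi that would make the rule exact: area / (d/2)^2."""
    return egyptian_circle_area(2)  # for d = 2 the radius is 1


if __name__ == "__main__":
    print(egyptian_circle_area(9))    # 64
    print(egyptian_circle_area(10))   # 6400/81
    print(float(implied_pi()), math.pi)
```

For a diameter of 9 cubits the rule gives 64 square cubits, while the modern formula gives about 63.6, so the procedure is accurate to well under one percent.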

Thus, the notion of an algorithm is clear and also the notion of an arithmetic
procedure separated conceptually from the problem. There is, of course,
no formulation in algebraic symbols, no symbol for an “arbitrary number,”
and no symbolic method for handling this concept. 


4.3. Babylonian Mathematics 

Babylonian mathematics differs from the Egyptian in a number of 
aspects and probably should be considered as more sophisticated. Their 
arithmetic also involved mixed numbers, but they are expressed in the 
sexagesimal system, and the fractional part corresponds to a fraction with a 
denominator in the form $2^p 3^q 5^r$. The sexagesimal system was a place system 
in which the numbers in each place were expressed decimally. Thus, one 
needed only addition and multiplication tables for the digits, 1,..., 9, and for 
10, 20,..., 50. But there were ambiguities insofar as there was no zero digit or 
sexagesimal point, and these ambiguities had to be resolved from the context. 
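
A small sketch may make the place system and its ambiguity concrete. The helper names are hypothetical, and `written_signs` is only our model of a notation that has no zero digit.

```python
def to_sexagesimal(n):
    """Base-60 digits of a positive integer, most significant place first."""
    digits = []
    while n > 0:
        n, digit = divmod(n, 60)
        digits.append(digit)
    return digits[::-1]

def written_signs(n):
    """Digits as recorded without a zero sign: empty places simply vanish."""
    return [d for d in to_sexagesimal(n) if d != 0]

def tens_and_units(digit):
    """Split one sexagesimal digit (1..59) into its decimal tens and units."""
    return (digit // 10) * 10, digit % 10

print(to_sexagesimal(4000))                     # 1*3600 + 6*60 + 40
print(written_signs(61), written_signs(3601))   # the same sequence of signs
print(tens_and_units(59))
```

Since 61 and 3601 produce the same signs, only the context of the problem can tell the reader which number is meant.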

Thus, the Babylonian arithmetic was quite different from the Egyptian 
described above. Procedures were much more formally organized around 
tables. For the purposes of division, reciprocals were used. The most elementary 
tables would list the “regular numbers,” i.e., those in the form $2^p 3^q 5^r$, and 
their reciprocals. These tables would therefore not contain reciprocals for 
7, 11, 13, etc. But there were also more advanced tables containing approximations 
to the reciprocals of numbers with these primes as factors and also 
reciprocal tables obtained by repeated doubling and halving from a given 
pair. Multiplication tables contain the multiples 2c, 3c,..., 19c, 20c, 30c, 40c, 
50c for given numbers c. This is not the minimum necessary for multiplication, 
but it was probably more convenient. The article by Neugebauer and Satz (9) 
describes the available tablets and the form assumed by these multiplication 
tables, some of which are quite extensive. There are also tables for squares, 
square roots, and cube roots, and these are highly significant for the kind of 
problems dealt with. There are also tables of powers. 
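
One can see why the regular numbers are singled out by computing reciprocals digit by digit in base 60. The sketch below is a modern reconstruction under our own naming; it makes no claim about how the tablets were actually compiled.

```python
from fractions import Fraction

def is_regular(n):
    """True if n has no prime factors other than 2, 3, and 5."""
    for prime in (2, 3, 5):
        while n % prime == 0:
            n //= prime
    return n == 1

def reciprocal_digits(n, max_places=6):
    """Sexagesimal fractional digits of 1/n, or None if they do not
    terminate within max_places places."""
    remainder = Fraction(1, n)
    digits = []
    for _ in range(max_places):
        if remainder == 0:
            break
        remainder *= 60
        digit = int(remainder)      # whole part becomes the next digit
        digits.append(digit)
        remainder -= digit
    return digits if remainder == 0 else None

table = {n: reciprocal_digits(n) for n in range(2, 21) if is_regular(n)}
print(table)
print(reciprocal_digits(7))         # no terminating expansion is found
```

For the numbers 7, 11, 13, and so on, the expansion never terminates, which is why only approximations could be tabulated for them.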

Geometric problems are associated either with areas, as in surveying 
problems, or with volumes, as in problems concerning the excavation of 
canals and irrigation ditches. The effort needed is estimated in terms of 
man-days and the cost in the form of the corresponding wages to be paid in 
silver. The excavation volume is divided into three layers, and the work for a 
given volume depends on the depth of the layer. Other volume problems 
involve the number of bricks in a pile of given volume. Problems tend to have 
an economic aspect, which one would expect if they are the concern of 
administrators. Area problems of nonrectangular areas are handled only 
approximately. 

Problems are presented in concrete terms, and algorithms are indicated 
in terms of operations on specific numbers when the procedure for solving 
problems is explicitly given. The solution procedure is in the second person. 
There is a wide range of problems in which one seeks two unknown quantities, 
“length” $x$ and “width” $y$, for which the “area” $xy$ is given and a 
linear combination described by a sequence of numerical operations is given. 
We would deal with such a problem by solving the linear combination for 
one variable, substituting in the other equation, and solving the quadratic 
equation. The solutions given follow precisely the formula for solving the 
resulting quadratic equation, but there is no hint as to the intermediate 
process. It has been suspected that the relation $[(x + y)/2]^2 = xy + [(x - y)/2]^2$ 
was inferred on geometrical grounds, possibly in the form $(a + b)^2 = 4ab + (a - b)^2$. 
Thus in Figure 4.1, this relation becomes immediately obvious if 
one moves the area I to the position II. But the range of problems requires 
some more sophisticated procedures. Notice that the numerical procedure 
corresponding to the formula for solving quadratic equations requires square 
roots, and, of course, tables for both the square and square roots were used. 
There are also straight linear problems and certain rather ubiquitous arithmetic 
series problems. 
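
As an illustration of the numerical procedure, consider the simplest case in which the given linear combination is just the sum $x + y$. This choice, the function name, and the sample numbers are our own assumptions; the tablets treat many variants. The half-sum and half-difference relation then gives both unknowns with one square root.

```python
import math

def length_and_width(total, area):
    """Find x >= y with x + y = total and x * y = area, using
    [(x + y)/2]^2 = xy + [(x - y)/2]^2."""
    half_sum = total / 2
    under_root = half_sum ** 2 - area
    if under_root < 0:
        raise ValueError("no real length and width satisfy these data")
    half_diff = math.sqrt(under_root)      # the step that needs a root table
    return half_sum + half_diff, half_sum - half_diff

x, y = length_and_width(29, 210)
print(x, y)
```

Only halving, squaring, subtraction, and one square root are required, which matches the kinds of tables described above.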

A very interesting table is one that gives a sequence of triplets of numbers 
that satisfy $l^2 + b^2 = d^2$ arranged so that $d^2/l^2$ is increasing. Apparently these 
triplets were obtained by means of the formulas $l = 2pq$, $b = p^2 - q^2$, $d = p^2 + q^2$, 
where $p$ and $q$ are “regular” numbers, that is, have precise reciprocals in the 
sexagesimal system. There is no indication as to the purpose of this table, but 
a modern table of trigonometrical functions is also a table of right triangles. 
The ancient table is a table of right triangles arranged according to the secant. 

Figure 4.1. Geometric algebra construction. 
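
The generating formulas for the triplets are easily tried out. The sketch below is hypothetical in its details: the bound on $p$, the restriction to $p$ and $q$ without common factor, and the function names are our choices, and the output is not a reproduction of the ancient table.

```python
from fractions import Fraction
from math import gcd

def is_regular(n):
    """True if n has no prime factors other than 2, 3, and 5."""
    for prime in (2, 3, 5):
        while n % prime == 0:
            n //= prime
    return n == 1

def triples(limit):
    """Triples (l, b, d) with l = 2pq, b = p^2 - q^2, d = p^2 + q^2
    for regular p > q up to limit."""
    rows = []
    for p in range(2, limit + 1):
        for q in range(1, p):
            if is_regular(p) and is_regular(q) and gcd(p, q) == 1:
                rows.append((2 * p * q, p * p - q * q, p * p + q * q))
    # Arrange the rows by d^2 / l^2, compared as exact fractions.
    rows.sort(key=lambda row: Fraction(row[2] ** 2, row[0] ** 2))
    return rows

table = triples(12)
for l_side, b_side, d_diag in table[:5]:
    print(l_side, b_side, d_diag)
```

Because $l = 2pq$ is itself regular, the ratio $d/l$ has a terminating sexagesimal expansion, which is presumably what made these particular triplets convenient.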


The sexagesimal system of Babylonia has survived in our measurements 
of angles and time. There are historians who believe that the geometric 
algebra of the Greeks developed from Babylonian procedures. Astronomy 
can be readily traced back to Babylonia. The Greeks claimed Egypt and 
Babylonia as ancient sources for their mathematics. The methods of the 
latter are certainly more than primitive, but the step to the classical mathematics 
of the Greeks was enormous. 

For further discussion of the material in this section, see Neugebauer and 
Satz. (9) 


4.4. Greece 

The historical development of ancient civilizations is a fascinating 
complex interaction of many elements, including trade, wars, and inventions. 
One of the most important of the inventions was that of the alphabet by the 
Phoenicians. This was the core element that integrated the culture of the 
Greek cities of the Mediterranean. 

We must limit ourselves not only to mathematics but even to certain 
aspects of mathematics, in particular its conceptual character. One classical 
development was that of centers of higher education. Formal education for 
the well-to-do was by tutors in its elementary stages. But for those who wanted 
to go further there was travel and certain centers of learning where one could 
study one or more subjects, such as rhetoric, law, philosophy, and, most 
universally, mathematics. 

In Athens around 400 B.C., Plato founded the Academy, which continued 
until 529 A.D., when it was closed by an edict of Emperor Justinian. Although 
the Academy did not have the organization of a modern university, it 
had formal lectures, produced handwritten manuscripts, held property, and 
had a rather respectable endowment. Aristotle founded a rival institution 
called the Lyceum. In Alexandria there was the famous Museum, or Temple 
of the Muses, with a library and staff of scholars supported for many centuries 
by the Ptolemies and later by the Roman emperors. The mathematics of 
classical times is associated with Alexandria in one or more ways. 

Mathematics played an introductory role in the classical higher educational 
program. The name itself is from the word for learning, μάθημα, and the subject 
comprised geometry, arithmetic, mechanics, astronomy, optics, geodesy, 
music, and calculation. According to the Greeks, geometry was first discovered 
by the Egyptians and was used by them to resurvey their land after the 
Nile inundations. The Phoenicians are associated with number theory 
because of the necessities of trade and exchange. 

Men like Thales of the city of Miletus (625-545 B.C.), who traveled to 
Egypt and Mesopotamia and studied there for a number of years, were 
credited with introducing mathematics to the Greeks. The names of various 
early mathematicians are known from later references, but the mathematics 
of this initial period is known to us only in the form of Euclid’s Elements. 
Euclid (330-250 B.C.) lived in Alexandria in the time of the first Ptolemy. 
His Elements apparently played the role of a basic textbook and also a treatise 
of fundamental results on which the more sophisticated works of 
Archimedes and Apollonius were based. 

An appreciation of classical mathematics is best obtained from commentaries 
directly associated with translations of the original works when 
they are available. Probably one should begin with The Thirteen Books of 
Euclid's Elements by Heath. (5) We have mentioned Van der Waerden’s (12) 
book, which discusses the attainments of Greek mathematics in detail and 
with their procedures of proof. Much of our information concerning ancient 
mathematics is in the commentary of Proclus, who also represents a Platonic 
viewpoint many centuries after Plato; see Proclus: A Commentary on the 
First Book of Euclid's Elements by Morrow. (7) Archimedes’ work is discussed 
in Archimedes by Dijksterhuis. (4) The successes and limitations of Greek 
mathematics from a modern point of view are discussed in The Role of 
Mathematics in the Rise of Science by Bochner. (1) 


4.5. Euclid’s Elements 

Let us now consider the various aspects of Euclid’s Elements that do not 
appear in the available versions of previous mathematics. First, there is the 
logical structure. There is a step-by-step development beginning with “first 
principles” and yielding a sequence of proven statements. Euclid’s Elements 
presents an abstraction from spatial experience, termed “geometry,” which 
involves two new characteristics. One of these is constructive and contains 
many results of great practical importance. But geometry deals with magnitudes, 
and magnitudes require a subtle extension of the basic conceptual 
complex of objects, sets, and natural numbers. The notion of relative size of 
two magnitudes of the same kind may not be expressible as a ratio of natural 
numbers. 

From Euclid on, the logical aspect has been the distinguishing characteristic 
of mathematics, and geometry has been the bridge between elementary and 
“higher” education. The “first principles” are presented in three forms— 
definitions, postulates, and “common notions” or axioms. The main burden 
of expressing the conceptual structure is placed on the definitions. These 
occur at the beginning of every book, except Books VIII, IX, XII, and XIII 
of the Elements. In many cases they are not simply definitions in the modern 
sense of abbreviations or names for previously developed concepts. Instead 
they are efforts to describe the conceptual structure on which the discussion 
is based. They formulate the abstraction process by which spatial experience 
is condensed into certain essential elements and is expressed by diagrams. 

Thus, we have the definitions for a point (I.1. A point is that which has no 
part), for a line (I.2. A line is a length without breadth), for a surface (I.5. 
A surface is that which has length and breadth only), and a straight line 
(I.4. A straight line is a line which lies evenly with the points on itself). Somewhat 
similar is the definition of an angle (I.8. A plane angle is the inclination 
to one another of two lines in a plane which meet each other and do not lie 
in a straight line); this definition is also specialized into the notion of rectilinear 
angle, i.e., one whose sides are straight lines. Rectilinear angles in Euclid 
are treated as magnitudes. In modern geometric treatments, these notions 
would be considered “undefined terms” with postulated relations between 
them. 

Other definitions explain relationships (I.3. “The extremities of a line 
are points”). There is, of course, a large number of definitions describing plane 
figures and spatial configurations, for example, right angle, acute angle, 
obtuse angle, plane figure, circle, diameter, center, semicircle, triangle, and 
quadrilateral. 

The constructive geometric development is based on definitions and on 
the postulates. There are five postulates. Postulates 1, 2, and 3 state that one 
can construct a line segment with given end points, a straight line through two 
points, and a circle with a given center and radius. Postulate 4 states that all 
right angles are equal. Postulate 5 is, “If a straight line falling on two straight 
lines makes the interior angles on the same side less than two right angles, 
the two straight lines, if produced indefinitely, meet on that side on which 
are the angles less than two right angles.” 

At first glance, Postulate 4 seems to be similar to the proven statements, 
and since the logical discussion needs a starting point, this would seem to be 
its role. Postulate 5 appears to be another “constructive” postulate, i.e., it 
permits one to construct a point. However, the situation is more subtle than 
this and we will return to it. 

The geometry based on these definitions and postulates is a sequence of 
propositions. A discussion of mathematical reasoning and the logical structure 
of propositions is given in Proclus (see Morrow, (7) pp. 159ff.; the numbers 
in the margin refer to the Friedlein text and references to Proclus in Heath (5) 
also follow this text). In F. 206 he tries to fit the reasoning to the syllogistic 
form and also “cause and effect” in the sense of “essential cause.” But in F. 
207 he plunges into an entirely different explanation. 

Furthermore, mathematicians are accustomed to draw what is in a way a double 
conclusion. For when they have shown something to be true of a given figure, they infer that 
it is true in general going from the particular to the general conclusion. Because they do not 
make use of the particular qualities of the subject but draw the angle or straight line in order 
to place what is given before our eyes, they consider what they infer about the given angle or 
straight line can be identically asserted for every similar case. They pass therefore to the 
universal conclusion in order that we may not suppose that the result is confined to the 
particular instance. This procedure is justified, since for the demonstration, they use the 
objects set out in the diagram not as these particular figures but as figures resembling others 
of the same sort. It is not having such-and-such a size that the angle before me is bisected, but 
as being rectilinear and nothing more. Its particular size is a character of the given angle, but 
its having rectilinear sides is a common feature of all rectilinear angles. Suppose the given 
angle is a right angle. If I used rightness for my demonstration, I should not be able to infer 
anything about the whole class of rectilinear angles: but if I make no use of its rightness and 
consider only its rectilinear character, the proposition will apply equally to all angles with 
rectilinear sides. 

A complete proposition according to Proclus (F. 203) should contain 
“an enunciation, an exposition, a specification, a construction, a proof and 
a conclusion,” although the most essential parts are the enunciation, proof, 
and conclusion. The “enunciation” is, of course, the statement of the proposition. 


The exposition takes separately what is given and prepares it in advance for use in the 
investigation. The specification takes separately the thing that is sought and makes clear 
precisely what it is. The construction adds what is lacking in the given for finding what is 
sought. The proof draws the proposed inference by reasoning scientifically from the propositions 
that have been admitted. The conclusion reverts to the enunciation, confirming what 
has been proven. 

Thus, geometrical reasoning is an abstracted intellectual experience in 
which the diagram provides both a symbolic record and support for the 
imaginative development. The general statements are given a specific diagram 
interpretation in the exposition and specification. Our intellectual experience 
is with the specific diagram. Further elements of the diagram are then introduced 
in the construction, which must be based on the postulates (F. 209. 
In general, the postulates contribute to the construction and the axioms to 
the proofs). The proof then follows by considering the properties of the 
completed figure as known either from the axioms, the postulated properties 
of the constructed elements, or previously proven propositions. The choice 
of the constructed figure may represent considerable ingenuity and analysis 
of the geometrical relations. 

The constructive geometry of Euclid included the theory of congruent 
figures and the construction of the regular polygons of three, four, five, and 
six sides and the five regular polyhedra. We have mentioned the geometric 
algebra of the Sumerians in which algebraic problems are formulated in 
geometric terms. The corresponding procedures are, of course, available in 
Euclid. But now Euclid can present constructive versions of the algorithms 
and prove the desired relations. 


4.6. Magnitudes 

In our experience, it is normally necessary to deal with the notion of 
amount. We buy some things by length, others by area, volume, or by weight. 
We usually handle this situation by analogy with the situation in which we 
deal with discrete objects. We choose a “unit amount” and consider that we 
are dealing with an integral multiple of this amount. But this choice of “unit 
amount” is clearly arbitrary and our quantities in general come in “odd 
lengths.” While this procedure may be a practical simplification, it does not 
correspond to a satisfactory formulation of the experience pattern. 

Thus, we must deal with the concept of a magnitude and a precise 
discussion must represent an experience pattern that extends the elementary 
logical combination of objects, sets, and natural numbers. The axiomatic 
approach of Euclid does permit one to consider the notion of magnitude in a 
satisfactory way. We assume that we can compare two magnitudes of the 
same kind to see which is the larger and determine when one is an integral 
multiple of the other and the inverse relationship, i.e., an “aliquot part.” 
Essentially, Euclid considers the full range of experience with these processes 
to yield a notion of relative size or ratio of two magnitudes of the same kind. 

The magnitudes that appear in Euclid’s Elements include lengths, angles, 
areas, and volumes. Comparison in size is possible between such magnitudes, 
and forming multiples is also possible. However, it had also been shown that 
there exist pairs of magnitudes that are not multiples of a common unit, for 
example, the side and diagonal of a square. Thus, the arithmetic of rational 
numbers is not adequate. 

The discussion of magnitudes is based on certain definitions and on the 
“common notions,” or axioms. The “common notions” describe properties of 
equality. “Equality” in Euclid has a number of distinct meanings, and these 
axiomatic properties apply to all of them. Thus, equality refers to identity, 
congruence, equality of magnitudes (as when a triangle is said to equal a 
parallelogram because they have equal area), and equality of the ratios of 
magnitudes. The first five axioms are 

1. Things equal to the same thing are equal to each other. 

2. If equals are added to equals, the wholes are equal. 

3. If equals are subtracted from equals, the remainders are equal. 

4. Things that coincide with each other are equal to each other. 

5. The whole is greater than the part. 

In Book V, Definition 3, a ratio is defined as a “relation in respect to size 
between two magnitudes of the same kind.” The notion of integral multiple 
is defined in Definition 2. The equality of two ratios is specified by Definition 
5. In modern symbols $a : b = c : d$ if for every pair of integers $m$ and $n$ the 
multiples $ma$ and $nb$ are related as either $ma < nb$, $ma = nb$, or $ma > nb$ accordingly 
as $mc < nd$, $mc = nd$, or $mc > nd$. In Definition 7, $a : b$ is said to be greater 
than $c : d$ if there is a pair of integers $m$, $n$ such that $ma > nb$ and $mc \le nd$. 
The effect then of the definition of equality is to consider for each ratio a 
division of the pairs of integers (i.e., the fractions) into three disjoint sets and 
define equality as corresponding to the same division of the set of fractions. 
This, of course, corresponds to the modern definition of real numbers as 
essentially just this same division of the fractions into subsets. 
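
Definition 5 quantifies over all pairs of integers, so no finite computation can establish it. Still, a bounded search conveys the idea and can refute an alleged equality. In the sketch below the bound, the function names, and the use of floating-point numbers for the diagonal of a square are all our own assumptions.

```python
import math

def sign(value):
    """Return -1, 0, or +1 according to the sign of value."""
    return (value > 0) - (value < 0)

def same_ratio(a, b, c, d, bound=50):
    """Compare a : b with c : d in the manner of Definition 5, but only
    for multipliers m, n up to bound (evidence, not proof)."""
    return all(
        sign(m * a - n * b) == sign(m * c - n * d)
        for m in range(1, bound + 1)
        for n in range(1, bound + 1)
    )

side, diagonal = 1.0, math.sqrt(2.0)
print(same_ratio(2, 3, 4, 6))                       # commensurable case
print(same_ratio(side, diagonal, diagonal, 2.0))    # side : diagonal = diagonal : 2 sides
print(same_ratio(side, diagonal, 5, 7))             # 5 : 7 is only close
```

The last comparison fails because some fraction $n/m$ separates the two ratios, which is exactly the situation described by Definition 7.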

In classical geometry, a ratio is a relationship not an entity. Later 
mathematicians were to introduce the notion of a real number as a ratio of a 
special kind of magnitude, i.e., line segments on a straight line. Consider the 
Euclidean straight line; specify a point $O$ as origin and a line segment $OP_1$ 
as unit. Then for every point $P$ on the line there is the “real number $x$” as the 
ratio of $OP$ to $OP_1$ (this included a notion of sign as a later development). 
One can regard as intuitive that every ratio has a ratio of this type equal to it, 
i.e., they specify the same division of the fractions into sets. Thus, given a 
magnitude of any kind and the appropriate unit of the same kind, we can 
associate a size as the real number corresponding to the ratio of the magnitude 
to the unit. 

It seems natural, therefore, to find in Euclid a considerable discussion of 
the properties of natural numbers and what we now consider to be the algebra 
of ratios. Thus, most of the results in number theory that precede the unique 
factorization theorem appear and are given proofs in terms of multiples of a 
line segment. The basic result is the Euclidean algorithm for finding the greatest 
common divisor of two numbers. “Number theory” permits Euclid to 
show the irrationality of certain ratios and develop a theory of quadratic 
irrationalities. 
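
In the spirit of multiples of a line segment, the Euclidean algorithm can be sketched as repeated subtraction of the smaller quantity from the larger. The function name and the sample numbers are ours.

```python
def common_measure(a, b):
    """Greatest common divisor of two positive integers by repeated
    subtraction of the smaller from the larger."""
    if a <= 0 or b <= 0:
        raise ValueError("both numbers must be positive")
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a
    return a

print(common_measure(1071, 462))
```

For two integers the process must stop, since the quantities decrease and stay positive. For a pair of magnitudes without a common unit the same process would continue indefinitely.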

The notion of magnitude and ratio permits one to deal with the possibility 
that the ratio of the circumference of a circle to its diameter is not 
rational and to approximate it by geometrical constructions as closely as 
desired. One can inscribe and circumscribe polygons relative to a circle for 
this purpose; this corresponds to squeezing down on the desired ratio as 
tightly as desired. Relations were obtained in classical times equivalent to our 
present formulas for the areas of triangles, regular polygons, circles, cones, 
spheres, and zones on spheres and the volumes of prisms, tetrahedra, cones, 
frustums of cones, and spheres. 
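
The squeezing process can be imitated numerically. The sketch below starts from the regular hexagons inscribed in and circumscribed about a circle of diameter 1 and doubles the number of sides. The recurrence used (a harmonic mean followed by a geometric mean of the perimeters) is a modern formulation that we assume here for convenience; it is not the classical geometric argument.

```python
import math

def polygon_bounds(doublings):
    """Perimeters of inscribed and circumscribed regular polygons for a
    circle of diameter 1, starting from hexagons."""
    inscribed = 3.0                          # perimeter of the inscribed hexagon
    circumscribed = 2.0 * math.sqrt(3.0)     # perimeter of the circumscribed hexagon
    sides = 6
    for _ in range(doublings):
        circumscribed = (2.0 * inscribed * circumscribed
                         / (inscribed + circumscribed))
        inscribed = math.sqrt(inscribed * circumscribed)
        sides *= 2
    return sides, inscribed, circumscribed

sides, lower, upper = polygon_bounds(4)      # 6, 12, 24, 48, 96 sides
print(sides, lower, upper)
```

Each doubling narrows the interval between the two perimeters, so the ratio of circumference to diameter is trapped as tightly as desired.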

Thus, the total formulation found in Euclid with its range of geometric 
information is a stunning intellectual achievement. The range of experience 
patterns presented was a cultural inheritance basic to all subsequent civilization. 
Not the least important of these was the notion of an “axiomatic development” 
that would certainly permit independent development, but science 
was to advance by incorporating the Euclidean structure in a larger framework 
of “natural philosophy.” The arithmetic of aggregation and accounting 
still had an independent existence, but it was now linked to a much more 
powerful partner. 


4.7. Geometry and Philosophy 

The Grecian geometer considered his geometry as a true description of 
space, and his arguments are efforts of his mind to reach the truth. Aristotle 
considered the abstraction process as dealing with forms associated with the 
essential nature of the object. Plato considered the forms as representing a 
higher ideal truth than that associated with actual mundane objects. There 
is a considerable discussion of Aristotle’s views in Heath. (5) The classical 
notion of abstraction is discussed in De Wulf’s (3) The System of Thomas 
Aquinas. The views of Plato on mathematics are of considerable importance 
and we quote from the translation of B. Jowett (6) from the end of Book VI 
of the Republic : 

...You are aware that students of geometry, arithmetic, and the kindred sciences 
assume the odd and the even and the figures and three kinds of angles and the like in their 
several branches of science: these are their hypotheses, which they and everybody are 
supposed to know, and therefore they do not deign to give any account of them either to 
themselves or to others: but they begin with them, and go on until they arrive at last, and in a 
consistent manner, at their conclusion.... 

...although they make use of visible forms and reason about them, they are thinking 
not of these, but of the ideals which they resemble: not of the figures which they draw, but 
of the absolute square, and the absolute diameter, and so on—the forms which they draw 
or make, ... are converted by them into images but they are really seeking to behold the 
things themselves which can only be seen with the eye of the mind?... 

And of this I spoke as the intelligible, although in the search after it the soul is compelled 
to use hypotheses: not ascending to a first principle, because she is unable to rise above the 
region of hypothesis, but employing the objects of which the shadows below are resemblances 
in their turn as images, they having in relation to the shadows and reflection of them a 
greater distinction and therefore a higher value.... When I speak of the other division of the 
intelligible, you will understand me to speak of that other sort of knowledge which reason 
itself attains by the power of the dialectic, using the hypotheses not as first principle but 
only as hypotheses, that is to say, as steps and points of departure into a world which is 
above hypotheses, in order that she may soar beyond them to the first principle of the whole; 
and clinging to this and then to that which depends on this by successive steps she descends 
again without the aid of any sensible object from ideas, through ideas and in ideas she ends. 

In Book VII of the Republic the educational values of geometry from the 
intellectual point of view are stressed, and there is a minor concession relative 
to its practical value. 

It must be admitted that the logical ideal of inferences based purely on 
previously established results is not realized in Euclid, and these discrepancies 
are considered carefully in Heath. Questions of interior and exterior 
of a figure and indeed certain intersection properties are assumed from 
diagrammatic experience. Heath also discusses the relationship of Euclid’s 
“definitions” with the norms of the logic of Aristotle. In modern geometry, 
these gray areas of Euclid have been replaced in many different ways by 
precise axiomatic treatments of “incidence geometries” or geometries 
involving analytic notions such as the cross product. 

The choice of the initial principles of geometry was clearly based on 
patterns of experience, and it was reasonable to assume that this abstraction 
process was a method of grasping the truth. But the use of a priori knowledge 
to base a theory was subject to a number of weaknesses, irrespective of how 
firmly this knowledge was based on our intuition, which presumably is an 
integration of past experience. The information used was selected under 
various pressures to make the resultant theory conform to preset philosophical, 
religious, economic, or political theories. This objection does not 
seem to be applicable to geometry, but in geometry the fifth postulate was an 
absolute relationship whose verification would require unattainable accuracy. 
It is equivalent to the statement that the sum of the angles of a triangle is 
a straight angle. Thus, it is impossible to be sure by measurement that one 
has this rather than a slightly different case, which would correspond to a 
non-Euclidean geometry. 

Thus, principles based on previously available knowledge require 
experimental verification, and this combination corresponds to an inductive 
process for establishing the postulates of an acceptable theory. But both the 
range of available experience and the precision of the experimental verification 
are limited at any time, so that while a theoretic advance may produce a 
considerable expansion of understanding at a certain time, our experience 
is that new limits will appear and we deal with repeated adjustments. Our 
experience with space and time has had this character. 


4.8. The Conic Sections 

Classical geometry is fascinating in its own right. One can obtain an 
excellent introduction from the works of Heath, (5) Van der Waerden, (12) and 
others, and one can use translations to go deeper into the works of Archimedes 
and Apollonius. The Commentary of Proclus (7) is excellent reading. We will 
briefly discuss the conic sections, since the results on them had very important 
later effects. 

For us the geometric relations involved are most readily handled by 
means of the modern algebraic equivalents. One effective way of expressing 
geometric relations is in terms of perpendicular line segments. For example, a 
circle can be described by a relation 

$$y^2 = x(2r - x),$$

where $(x, 2r - x)$ is a division of a diameter and $y$ a perpendicular half chord 
at the division point. Such a relation was called a “symptom.” For an ellipse 
one has the symptom 

$$y^2 = \alpha x(2a - x)$$

for some value of $\alpha$ with $0 < \alpha < 1$. Similarly the hyperbola has a symptom 

$$y^2 = \alpha x(2a + x)$$

and the parabola 

$$y^2 = 2px.$$

Clearly this concept is similar to our notion of the “equation of the locus” in 
Cartesian coordinates (see Van der Waerden, (12) pp. 241ff). 
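
The connection with the Cartesian equation can be made explicit for the circle and the ellipse. In the following sketch we place the origin at one end of the diameter (or major axis), and we introduce $b$ for the semi-minor axis of the ellipse, a symbol that is our own addition; $\alpha$ denotes the constant in the symptom of the ellipse.

```latex
% Circle of radius r with centre (r, 0):
(x - r)^2 + y^2 = r^2
  \;\Longrightarrow\; y^2 = 2rx - x^2 = x(2r - x).

% Ellipse with centre (a, 0), semi-major axis a, semi-minor axis b (assumed symbol):
\frac{(x - a)^2}{a^2} + \frac{y^2}{b^2} = 1
  \;\Longrightarrow\; y^2 = \frac{b^2}{a^2}\, x(2a - x),
  \qquad \alpha = \frac{b^2}{a^2}, \quad 0 < \alpha < 1 .
```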

Originally the conic sections were obtained by intersecting right circular 
cones by planes. A right circular cone is a cone such that a plane perpendicular 
to the axis intersects the cone in a circle (or at the vertex). A plane that 
intersects one nappe only of the cone will yield an ellipse, except when it is 
parallel to an element, in which case the intersection is a parabola. If the 
plane intersects both nappes, the intersection is a hyperbola. The symptom is 
readily obtained from this definition. 

Apollonius of Perga showed that the plane section of any circular cone 
was a plane section of a right circular cone, i.e., a conic section in the restricted 
sense. To obtain this more general result, one must generalize the notion of a 
symptom. For the symptom in the previous case, x is laid off on the major 
axis and the direction for y is taken perpendicular to this axis. There is a 
corresponding relation in which one takes x along any central diameter and 
measures y in a fixed direction from P to the locus, but this fixed direction in 
general will not be perpendicular to the chosen diameter. 

For the ellipse, one can readily find the pairs of directions that yield 
such a symptom. Let us consider the standard symptom on the major axis, 
$y^2 = \alpha x(2a - x)$. It is convenient to replace $x$ by $x' = a - x$ so that one has the 
central form $y^2 + \alpha x'^2 = \alpha a^2$. We drop the prime. Now consider any diameter, 
i.e., line through $(0, 0)$. This will intersect the ellipse in the points $(x_1, y_1)$, 
$(-x_1, -y_1)$. Let $(x_2, y_2)$ be chosen on the ellipse so that $y_1 y_2 + \alpha x_1 x_2 = 0$. It 
is an interesting exercise to show that this can always be done. 

If we express an arbitrary point $(x, y)$ vectorially in terms of $(x_1, y_1)$ and 
$(x_2, y_2)$, we have $(x, y) = s(x_1, y_1) + t(x_2, y_2)$ or 

$$x = s x_1 + t x_2, \qquad y = s y_1 + t y_2.$$

The condition, $y^2 + \alpha x^2 = \alpha a^2$, that $(x, y)$ be on the ellipse, becomes 

$$\alpha a^2 (s^2 + t^2 - 1) + 2st(y_1 y_2 + \alpha x_1 x_2) = 0.$$

Thus, the condition becomes $s^2 + t^2 = 1$. 

If $P_1(x_1, y_1)$ is on the ellipse, then the points $s(x_1, y_1)$ for $-1 < s < 1$ are 
on the chord joining $P_1$ to $(-x_1, -y_1)$. For each such value of $s$ there are two 
points $(sx_1, sy_1) \pm (1 - s^2)^{1/2}(x_2, y_2)$ on the ellipse. The line joining these two 
points is parallel to the chord $l_2$ joining $(x_2, y_2)$ and $(-x_2, -y_2)$. If $s^2 > 1$ there 
is no ellipse point on the line through $(sx_1, sy_1)$ parallel to $l_2$. For $s = \pm 1$ there 
is just one such point on the corresponding line parallel to $l_2$. These two 
latter lines are obviously limiting positions of a secant as it moves parallel to 
itself and thus are tangents. Thus, these elementary algebraic procedures, 
which involve at most quadratic relations, permit one to obtain tangents. 
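The conjugate-diameter relation can be checked numerically. The following is a minimal sketch, not part of the original discussion: the constants `ALPHA` and `A` and the helper names are illustrative assumptions. It builds a point conjugate to a chosen $(x_1, y_1)$ and confirms that the points $s(x_1, y_1) \pm (1 - s^2)^{1/2}(x_2, y_2)$ lie on the ellipse.

```python
import math

ALPHA, A = 0.5, 2.0  # illustrative constants in y^2 + ALPHA*x^2 = ALPHA*A^2

def on_ellipse(x, y, tol=1e-9):
    """True when (x, y) satisfies the central form of the symptom."""
    return abs(y * y + ALPHA * x * x - ALPHA * A * A) < tol

def conjugate_point(x1, y1):
    """A point (x2, y2) on the ellipse with y1*y2 + ALPHA*x1*x2 = 0."""
    # The direction (y1, -ALPHA*x1) satisfies the conjugacy condition;
    # rescale it so that it lands on the ellipse.
    dx, dy = y1, -ALPHA * x1
    k = math.sqrt(ALPHA * A * A / (dy * dy + ALPHA * dx * dx))
    return k * dx, k * dy

def chord_points(p1, p2, s):
    """The two ellipse points on the line through s*p1 parallel to p2."""
    t = math.sqrt(1.0 - s * s)
    return [(s * p1[0] + sign * t * p2[0], s * p1[1] + sign * t * p2[1])
            for sign in (1, -1)]

theta = 0.7  # arbitrary parameter selecting the first diameter
p1 = (A * math.cos(theta), math.sqrt(ALPHA) * A * math.sin(theta))
p2 = conjugate_point(*p1)
```

For $s = \pm 1$ the two points returned by `chord_points` coincide, which is the tangent case described above.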

For the parabola, $y^2 = 2px$, the alternate diameters that may be used
are all parallel to the original axis. Let $(x_0, y_0)$ be a point on the parabola
other than $(0, 0)$ and let

$$(x, y) = (x_0, y_0) + s(2x_0, y_0) + t(1, 0)$$

or

$$x = (1 + 2s)x_0 + t, \qquad y = (1 + s)y_0.$$

If we fix $t$ and let $s$ vary, we get a line through $(x_0 + t, y_0)$ with slope $y_0/2x_0$.
The condition that $(x, y)$ be on the parabola reduces to $s^2x_0 = t$, and hence, for
$t > 0$ there are two points on this line. For $t = 0$ we obtain the result that the
corresponding line is tangent to the parabola at $(x_0, y_0)$.
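A short numerical sketch of this reduction follows. The values of `P` and `X0` and the function names are assumptions chosen for illustration. It confirms that the point lies on the parabola exactly when $s^2x_0 = t$, with two values of $s$ for $t > 0$ and one for $t = 0$.

```python
import math

P = 1.5                     # illustrative parameter in y^2 = 2*P*x
X0 = 2.0
Y0 = math.sqrt(2 * P * X0)  # (X0, Y0) lies on the parabola

def point(s, t):
    """(x, y) = (x0, y0) + s*(2*x0, y0) + t*(1, 0)."""
    return (1 + 2 * s) * X0 + t, (1 + s) * Y0

def defect(s, t):
    """Zero exactly when point(s, t) lies on y^2 = 2*P*x."""
    x, y = point(s, t)
    return y * y - 2 * P * x

def roots_for(t):
    """Values of s with s^2 * x0 = t: two for t > 0, one for t = 0."""
    r = math.sqrt(t / X0)
    return sorted({r, -r})
```

With `t = 0` the only admissible value is `s = 0`, so the line through $(x_0, y_0)$ with slope $y_0/2x_0$ meets the parabola once, in agreement with the tangency claim.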
Figure 4.2. Alternate diameters for parabola.


The relation $s^2x_0 = t$ can also be expressed in terms of lengths. Let $n$ be
the length of $s(2x_0, y_0)$; i.e., $n^2 = s^2(4x_0^2 + y_0^2) = s^2(4x_0^2 + 2px_0)$. Then $s^2x_0 = t$
becomes $n^2 = (4x_0 + 2p)t$. We can make the following geometrical interpretation
(see Figure 4.2). Let $P$ be a point on the parabola and let $C$ be a point on
the tangent at $P$. Let $B$ be the intersection of the parabola with the line
through $C$ parallel to the axis. If we complete the parallelogram $PCBP'$, we
have $PC = P'B = n$ and $CB = PP' = t$. Thus, $(PC)^2 = (4x_0 + 2p)BC = l \cdot BC$,
where $l$ does not depend on $C$.

In the above, we have permitted algebraic convenience to divert us 
from the precise analogs of the geometric discussions. But one can point out 
that the ancient geometers had a facility with their type of discussion that 
certainly was as good as that associated with our algebra, and we can only 
appreciate the fascination and interest of this geometry when we can handle 
these problems with some ease. 

For further discussion of the material in this section, see Dijksterhuis,(4)
Heath,(5) Morrow,(7) and Van der Waerden.(12)


4.9. Parabolic Areas 

The concept of a tangent was, of course, highly significant for the future,
although new methods not tied so tightly to a geometric interpretation
would be required to yield the differential calculus. On the other hand,
the integral calculus arose directly from Archimedes’ procedures, and the
original arguments are highly significant. The initial problem considered
was determining the area bounded by the arc of a parabola and a chord. We
show first the following:

Lemma. (See Figure 4.3.) Let $P_1P_2$ be a chord of a parabola. Let $P_2Q$ be
tangent to the parabola, and let $QE$ and $CG$ be parallel to the axis of the
parabola. Let $P_2E$ be perpendicular to $QE$. Then

$$\frac{AB}{AC} = \frac{GE}{EP_2}.$$

Proof. From the argument following the discussion of conjugate axes
for the parabola, we obtain an $l$ such that

$$(P_2C)^2 = l \cdot CB, \qquad (P_2Q)^2 = l \cdot QP_1.$$

Thus,

$$\frac{CB}{QP_1} = \left(\frac{P_2C}{P_2Q}\right)\left(\frac{P_2C}{P_2Q}\right) = \frac{CA}{QP_1} \cdot \frac{P_2C}{P_2Q}.$$

Thus,

$$\frac{CB}{CA} = \frac{P_2C}{P_2Q} = \frac{P_2G}{P_2E},$$

which implies the equality of the Lemma.
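The Lemma can be spot-checked in coordinates. This is a sketch under assumed conventions that are not in the text: the parabola is taken as $x = y^2/2p$ with its axis along the $x$-axis, so lines parallel to the axis are lines of constant $y$, and the function names are hypothetical.

```python
P = 1.0  # illustrative parameter; the parabola is x = y^2 / (2*P)

def parabola_x(y):
    return y * y / (2 * P)

def lemma_ratios(y1, y2, y):
    """Return (AB/AC, GE/EP2) for the line at height y parallel to the axis.

    P1 and P2 are the parabola points at heights y1 and y2.
    """
    x1, x2 = parabola_x(y1), parabola_x(y2)
    xa = x1 + (x2 - x1) * (y - y1) / (y2 - y1)  # A on the chord P1P2
    xb = parabola_x(y)                          # B on the parabola
    xc = x2 + (y2 / P) * (y - y2)               # C on the tangent at P2
    return (xa - xb) / (xa - xc), (y - y1) / (y2 - y1)
```

For any height strictly between the endpoints the two returned ratios agree, which is the statement $AB/AC = GE/EP_2$.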


One can now show that the parabolic area $BP_2AP_1$ is equal to one-third
of the area of the triangle $P_1P_2Q$. Archimedes first gives a “heuristic argument.”
Let us take moments around $QE$. The above lemma yields $CA \cdot GE = AB \cdot EP_2$.
Thus, the moment of $CA$ around $QE$ equals the moment obtained
by placing $AB$ at $P_2$. If we consider the areas as made up of parallel lines, then
the moment of the triangle $P_1P_2Q$ must equal that of the parabolic area
$P_1BP_2A$ placed at $P_2$. Thus,

$$P_2E \cdot (P_1BP_2A) = \text{moment of } P_1P_2Q = (P_1P_2Q) \cdot \tfrac{1}{3}P_2E.$$

The moment of $P_1P_2Q$ around $P_2Q$ can be shown to be equal to the volume
of a tetrahedron with base $P_1P_2Q$ with altitude equal to $P_2E$ (see Figure 4.4).

Our modern procedure for finding the area $BP_2AP_1$ is by integration.
Let $x = GE$. Then

$$CA = QP_1(P_2E - x)/P_2E$$

and from the lemma, $BA = QP_1(P_2E - x)x/P_2E^2$. The usual summation
definition of the integral yields

$$(P_1BP_2A) = \int_0^{P_2E} BA\,dx = \tfrac{1}{6}\,QP_1 \cdot P_2E = \tfrac{1}{3}(P_1P_2Q).$$
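The summation definition can be imitated directly. The sketch below is an illustration only: it assumes the parabola $x = y^2/2p$ with axis along the $x$-axis, and the function names are hypothetical. It adds up midpoint slices of $BA$ and compares the total with one-third of the triangle $P_1P_2Q$.

```python
P = 1.0  # illustrative parameter; the parabola is x = y^2 / (2*P)

def parabola_x(y):
    return y * y / (2 * P)

def segment_area(y1, y2, n=2000):
    """Midpoint sum of the lengths BA over n slices of P2E."""
    x1, x2 = parabola_x(y1), parabola_x(y2)
    dy = (y2 - y1) / n
    total = 0.0
    for i in range(n):
        y = y1 + (i + 0.5) * dy
        chord_x = x1 + (x2 - x1) * (y - y1) / (y2 - y1)  # point A
        total += (chord_x - parabola_x(y)) * dy          # length BA times dx
    return total

def triangle_area(y1, y2):
    """Area of P1 P2 Q, with Q on the tangent at P2 and QP1 parallel to the axis."""
    x1, x2 = parabola_x(y1), parabola_x(y2)
    xq = x2 + (y2 / P) * (y1 - y2)  # tangent at P2 evaluated at the height of P1
    return 0.5 * (x1 - xq) * abs(y2 - y1)
```

The agreement improves as `n` grows, which is the limiting process referred to in the next paragraph.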

Our modern interpretation of the integral is that of a limiting process.
On the other hand the heuristic argument can be expressed in terms of
“infinitesimals,” that is, if we consider an integral as an infinite sum of
“infinitesimals.” Thus, the “lines” $AC$ and $AB$ are to be replaced by parallelograms
with the given line as base and height $dx$. But Archimedes considered a
heuristic argument such as the above as simply indicating the relation to be
obtained, and a satisfactory proof is given by the method of “exhaustion.”

Archimedes’ procedure is described in Van der Waerden(12) (pp. 216ff).
We present a variation of this argument. As in the integral approximation, we
divide $P_2E$ by points $G_0 = E, G_1, \ldots, G_n = P_2$. Corresponding to the point $G_i$
we have points $A_i$, $B_i$, $C_i$. Let $G'_i$ correspond to the midpoint of $G_{i-1}G_i$.
Consider Figure 4.5 with $KL$ tangent to the parabola at $B'_i$. We take $x_i = G_iE$,
$\Delta x = x_i - x_{i-1} = P_2E/n$. We can obtain by elementary procedures two formulas
for the moment, $M_i$, of the quadrilateral $A_{i-1}C_{i-1}C_iA_i$:

$$M_i = \tfrac{1}{2}\left[C_{i-1}A_{i-1}\,x_{i-1} + C_iA_i\,x_i\right]\Delta x + \tfrac{1}{6}\left(C_{i-1}A_{i-1} - C_iA_i\right)\Delta x^2 \qquad (m_1)$$

$$M_i = C'_iA'_i\,x'_i\,\Delta x - \tfrac{1}{12}\left(C_{i-1}A_{i-1} - C_iA_i\right)\Delta x^2. \qquad (m_2)$$

For example, $M_i$ can be evaluated as the volume of an appropriate solid.

If we substitute in $(m_1)$ the result from the lemma, i.e., $CA \cdot x = BA \cdot P_2E$, we
obtain

$$M_i = \tfrac{1}{2}\left(B_{i-1}A_{i-1} + B_iA_i\right)\Delta x \cdot P_2E + \tfrac{1}{6}\left(C_{i-1}A_{i-1} - C_iA_i\right)\Delta x^2.$$

Now $\tfrac{1}{2}(B_{i-1}A_{i-1} + B_iA_i)\,\Delta x$ is the area of the quadrilateral $B_{i-1}A_{i-1}A_iB_i$
contained in the parabolic region with these vertices. Thus, if $a_i$ denotes the
area of this parabolic region, we obtain

$$M_i < a_iP_2E + \tfrac{1}{6}\left(C_{i-1}A_{i-1} - C_iA_i\right)\Delta x^2.$$

If we sum over $i$ and let $M$ denote the moment of $P_1P_2Q$ around $P_2E$ and $a$ the
area of the parabolic segment $P_1AP_2B$, we obtain

$$M < aP_2E + \tfrac{1}{6}\,P_1Q\,\Delta x^2.$$

Again, if we use $(m_2)$ and the lemma, we obtain

$$M_i = B'_iA'_i\,\Delta x \cdot P_2E - \tfrac{1}{12}\left(C_{i-1}A_{i-1} - C_iA_i\right)\Delta x^2.$$

But $B'_iA'_i\,\Delta x$ is the area of the quadrilateral $KA_{i-1}A_iL$, which contains the
parabolic region and we have

$$M_i > a_iP_2E - \tfrac{1}{12}\left(C_{i-1}A_{i-1} - C_iA_i\right)\Delta x^2.$$

This yields

$$M > aP_2E - \tfrac{1}{12}\,P_1Q\,\Delta x^2.$$

Since $n$ can be taken arbitrarily large, the assumption that either $M > aP_2E$ or
$M < aP_2E$ leads to a contradiction. Provided one assumes the existence of $M$
and $a$, the above argument is logically satisfactory. A form of the integral
calculus developed from these procedures.
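The two moment formulas can be tested exactly with rational arithmetic. The sketch below rests on assumptions that should be read as such: the moment of a slice is taken to be the integral of $x \cdot CA$ over the slice, $CA$ is the linear function $QP_1(P_2E - x)/P_2E$, and the correction terms are given the coefficients $1/6$ and $1/12$ that the trapezoid and midpoint rules produce for such an integrand. The lengths `Q` and `H` are illustrative.

```python
from fractions import Fraction as F

Q, H = F(3), F(2)  # illustrative lengths QP1 and P2E

def ca(x):
    """Length CA at distance x = GE from the line QE."""
    return Q * (H - x) / H

def exact_moment(a, b):
    """Integral of x*CA(x) over [a, b], from the antiderivative."""
    def g(x):
        return Q * x * x / 2 - Q * x ** 3 / (3 * H)
    return g(b) - g(a)

def m1(a, b):
    """Endpoint (trapezoid) form with its correction term."""
    dx = b - a
    return F(1, 2) * (ca(a) * a + ca(b) * b) * dx + F(1, 6) * (ca(a) - ca(b)) * dx ** 2

def m2(a, b):
    """Midpoint form with its correction term."""
    dx = b - a
    mid = (a + b) / 2
    return ca(mid) * mid * dx - F(1, 12) * (ca(a) - ca(b)) * dx ** 2
```

Because the integrand is quadratic, both forms reproduce the exact moment of every slice, and the correction terms telescope to a multiple of $P_1Q\,\Delta x^2$ when summed.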



Figure 4.5. Slice of parabolic area. 
Another argument of Archimedes for obtaining the area of a parabolic
segment utilizes geometric series. Consider Figure 4.6: $P_1$ and $P_3$ are two
points on the parabola; $R$ is the midpoint of $P_1P_3$, and $RP_2$ is parallel to the
axis of the parabola; $Q_1Q_2$ is tangent to the parabola at $P_2$. One shows that
the parabolic segment $P_1P_2P_3R$ has an area two-thirds of the parallelogram
$P_1Q_1Q_2P_3$.

Let $\alpha(P_1, P_3)$ denote the area of the triangle $P_1P_2P_3$, which is half the
area of the parallelogram. Let $\beta(P_1, P_3)$ denote the area of the parabolic
segment. Thus,

$$\alpha(P_1, P_3) < \beta(P_1, P_3) < 2\alpha(P_1, P_3).$$

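Let $A$ be the midpoint of $P_1P_2$ with $CA$ parallel to the axis of the parabola.
The lemma implies that

$$\frac{CB}{Q_1P_1} = \frac{(CP_2)^2}{(Q_1P_2)^2} = \frac{1}{4}.$$

Since $CA = \tfrac{1}{2}Q_1P_1$, this yields $AB = \tfrac{1}{4}Q_1P_1$, and hence the triangle $P_1BP_2$
has area one-fourth that of $P_1Q_1P_2$. Hence $\alpha(P_1, P_2) = \tfrac{1}{8}\alpha(P_1, P_3)$. We also have

$$\alpha(P_1, P_3) + \alpha(P_1, P_2) + \alpha(P_2, P_3) < \beta(P_1, P_3) < \alpha(P_1, P_3) + 2\alpha(P_1, P_2) + 2\alpha(P_2, P_3)$$

and hence

$$\alpha(P_1, P_3)\left(1 + \tfrac{1}{4}\right) < \beta(P_1, P_3) < \alpha(P_1, P_3)\left(1 + \tfrac{1}{2}\right).$$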
















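A process of continued subdivision yields

$$\alpha(P_1, P_3)\left(1 + \tfrac{1}{4} + \cdots + \tfrac{1}{4^n}\right) < \beta(P_1, P_3) < \alpha(P_1, P_3)\left(1 + \tfrac{1}{4} + \cdots + \tfrac{1}{4^{n-1}} + \tfrac{2}{4^n}\right)$$

or

$$\alpha(P_1, P_3)\left(\tfrac{4}{3} - \tfrac{1}{3 \cdot 4^n}\right) < \beta(P_1, P_3) < \alpha(P_1, P_3)\left(\tfrac{4}{3} + \tfrac{2}{3 \cdot 4^n}\right).$$

Hence $\beta(P_1, P_3) = \tfrac{4}{3}\alpha(P_1, P_3)$.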
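The squeeze produced by continued subdivision can be tabulated exactly. In this minimal sketch the function names are illustrative, and the bounds are expressed as multiples of the area of the first inscribed triangle: each subdivision adds triangles totalling one-fourth of the previous level, and the upper bound counts the newest level twice.

```python
from fractions import Fraction as F

def lower(n):
    """Sum of the inscribed triangles through level n."""
    return sum(F(1, 4 ** k) for k in range(n + 1))

def upper(n):
    """Triangles through level n - 1, plus twice the level-n triangles."""
    return sum(F(1, 4 ** k) for k in range(n)) + F(2, 4 ** n)

# Width of the bracket around 4/3 after n subdivisions.
gaps = [upper(n) - lower(n) for n in range(6)]
```

The gap shrinks by a factor of four at each step, so no value other than $4/3$ can lie between every lower and upper bound.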

For further discussion of the material in this section, see Dijksterhuis,(4)
Morrow,(7) Neugebauer,(8) Neugebauer and Satz,(9) Peet,(10) Struve,(11) and
Van der Waerden.(12)


Exercises 

4.1. Consider the problem of expressing $2/(2k + 1)$ as a sum of fractions with numerator
one and different denominators and choosing the expansion in which the maximum
denominator is least.

4.2. If a quadrilateral has four sides, $a$, $b$, $c$, $d$, an ancient formula for the area is
$(a + c)(b + d)/4$, where $a$ and $c$ are opposite sides. Show that this formula yields an
overestimate in all cases except where the quadrilateral is a rectangle.

4.3. A regular polygon is one that is equilateral and equiangular. The constructions
of the regular triangle, quadrilateral, and hexagon are quite straightforward. The
construction of the regular pentagon, or five-sided figure, is elementary but more
involved.

(a) Suppose given a length $a$, one constructs a triangle with angles $\alpha = \pi/5$, $2\alpha$, $2\alpha$
and base $a$. One can then construct the pentagon by a simple compass procedure.
(See Figure 4.7.)



Figure 4.7. The pentagon. 
Figure 4.8. Triangle construction for the pentagon.

(b) The construction of such a triangle is clearly the problem of determining the
ratio of the side $y$ to the base $x$. Consider Figure 4.8. Let $ABC$ be a triangle with angles
$\alpha$, $2\alpha$, $2\alpha$ and base $x = AB$. Extend $AB$ to $D$ so that $AD = y$ and complete the triangle
$DCB$. One can show that $DCB$ is similar to $ABC$ and thus

$$\frac{y + x}{y} = \frac{y}{x}.$$

This implies $y/x = \tfrac{1}{2} + \tfrac{1}{2}\sqrt{5}$. Given $x$, one can construct a $y$ in this ratio, by means of a
suitable rectangle.

4.4. The regular septagon (seven-sided regular polygon) cannot be constructed by
ruler and compass.

(a) Let $ABCDEFG$ be a regular septagon (Figure 4.9). Draw the chords $AE$, $AD$
and the chord $GC$ intersecting $AE$ and $AD$ at $P$ and $Q$, respectively. One can show that
the triangle $APQ$ is a triangle with side $AQ$ equal to one side of the regular septagon and
having angles $\alpha = \pi/7$, $2\alpha$, and $4\alpha$. One can proceed along the following lines:

(1) Draw $AG$ and $AC$.

(2) One shows $\angle EAD = \alpha$, $\angle DAC = \alpha$, $\angle GCA = \alpha$.

(3) Also $\angle AQP = \angle QAC + \angle QCA = 2\alpha$.

(4) Also $\angle GAE = 2\alpha = \angle AGC$; $\angle APQ = 4\alpha$.

(5) $AG = AQ$.



Figure 4.9. The septagon. 
Figure 4.10. Septagon relations.


(b) Suppose that given a side $x$ we can construct a triangle with angles $\alpha$, $2\alpha$, and
$4\alpha$, the last opposite $x$ (Figure 4.10). Then we can construct a regular septagon of side $x$. Let the
sides opposite $\alpha$ and $2\alpha$ have length $y$ and $z$, respectively. One proceeds:

(1) $\alpha = \pi/7$. We extend $PQ$ to $G$ and $C$, with $GP = z$ and $QC = x$.

(2) Draw $AG$ and $AC$. $\angle QAC = \angle QCA = \tfrac{1}{2}\angle PQA = \alpha$.

(3) $\angle GAP = \angle AGP = \tfrac{1}{2}\angle APQ = 2\alpha$.

(4) $\angle GAC = 4\alpha$, $GA = AQ$.

(5) Pass the circle through $G$, $A$, and $C$. A regular septagon can be constructed
by bisecting $\angle AGC$ and doubly bisecting $\angle GAC$. Its side will be $GA = AQ$.



Figure 4.11. Constructive relations. 


(c) Given $APQ$ (Figure 4.11), proceed as in (b).

(1) Extend $PQ$ to $C$, with $QC = x$.

(2) $\angle QAC = \angle QCA = \tfrac{1}{2}\angle AQP = \alpha$.

(3) The triangle $APC$ is similar to $AQP$ and

$$\frac{x + y}{z} = \frac{z}{y}.$$

(4) Extend $AP$ to $R$ with $PR = PQ = y$.

(5) The triangle $RAQ$ is similar to $PAQ$ and thus

$$\frac{y + z}{x} = \frac{x}{z}.$$

(6) We have $xy + y^2 = z^2$ and $yz + z^2 = x^2$. Eliminating $y$ yields (for $x = 1$)

$$z^3 + 2z^2 - z - 1 = 0.$$

(7) The equation (6) on $z$ is irreducible in the rationals. Hence, its group is of
order 3 or 6, and hence the root cannot be constructed by ruler and
compass.
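The relations in step (6) can be confirmed numerically before attempting the elimination. This sketch is an aid to the exercise rather than a solution of it; the side lengths are computed from the law of sines, which is an assumption of the sketch and is not used in the text.

```python
import math

ALPHA = math.pi / 7
X = 1.0                                         # side opposite the angle 4*alpha
Y = X * math.sin(ALPHA) / math.sin(4 * ALPHA)   # side opposite alpha
Z = X * math.sin(2 * ALPHA) / math.sin(4 * ALPHA)  # side opposite 2*alpha

def cubic(z):
    """Left-hand side of z^3 + 2z^2 - z - 1 = 0."""
    return z ** 3 + 2 * z ** 2 - z - 1
```

Both quadratic relations and the cubic are satisfied to rounding error by these values.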

4.5. Consider the generalization of the preceding discussion to regular polygons 
of sides 11 and 13. 

4.6. The construction of the regular solids is given by Euclid in Book XIII. An 
analogous problem that provides considerable insight into the geometric relations 
involved is to determine a plane layout that by suitable cutting and creasing can be 
assembled into the surface of the solid. Duplicate faces are provided for pasting the 
figure together. Figure 4.12 illustrates this for the tetrahedron. Figure 4.13 indicates 
an approach to this problem for the dodecahedron. 



Figure 4.12. Plane layout for tetrahedron with overlap. 



Figure 4.13. Layout for dodecahedron. 
4.7. Show that if $x$ is the length of an edge of the dodecahedron and $y = \tfrac{1}{2}(1 + \sqrt{5})x$,
then one can inscribe a cube of edge $y$ in the dodecahedron so that the vertices of the
cube coincide with vertices of the dodecahedron. This permits one to determine the
radius of the circumscribed sphere.

4.8. Show that a dodecahedron can be inscribed in an icosahedron so that the 
midpoints of the faces of the icosahedron are vertices of the dodecahedron. What is the 
ratio of the edges of the figures? 

4.9. Establish the symptom for the circle. 

4.10. Obtain the symptom for an ellipse. Consider Figure 4.14. The focus of the
ellipse is the intersection, $C$, of the axis, $AO$, of the cone with the plane, $c$, of the conic
section. The axis of the ellipse is the chord, through $C$, perpendicular to the intersection
$b \cap c$. Let $P$ be a point that divides the axis into segments $x$ and $2a - x$, and let $y$ be the
perpendicular half chord. Let $m$ be the plane that contains $AO$ and $CP$. Let Figure 4.15
be in this plane. If we take the plane parallel to the base through $P$, its intersection with
the cone is a circle and we obtain $y^2 = z(h - z)$. One can also show

$$\frac{z(h - z)}{x(2a - x)} = \frac{r^2}{V_1C \cdot V_2C} = \alpha < 1,$$

and this yields the symptom.

4.11. Obtain the symptom of the parabola. Consider Figure 4.16. If the plane $c$ of the
conic section is parallel to the element $AQ$, we take the plane $m$ to contain $AQ$ and $AC$.
As before, $y^2 = zr = r^2x/VC$.



4.12. Obtain the symptom for the hyperbola. Consider Figure 4.17. This figure is
in the plane $m$, containing $AO$ and perpendicular to the plane $c$ of the conic section. As
before, one obtains

$$y^2 = z(h - z) = \frac{r^2}{V_1C \cdot V_2C}\,x(x + V_1V_2).$$



Figure 4.17. Cross section for the hyperbola. 
4.13. The procedure for obtaining conjugate axes for a hyperbola involves both the
locus of the symptom $y^2 - \alpha x^2 = \alpha a^2$ and the conjugate hyperbola $y^2 - \alpha x^2 = -\alpha a^2$.
Thus, $(x_1, y_1)$ is chosen on one locus and $(x_2, y_2)$ on the conjugate locus with
$y_1y_2 - \alpha x_1x_2 = 0$. One can obtain the condition $s^2 = t^2 + 1$ and a corresponding procedure for
obtaining tangents.

4.14. Show that the moment of the triangle $P_1P_2Q$ around $P_2Q$ equals the volume
of a tetrahedron with base $P_1P_2Q$ and altitude equal to $P_2E$. See Figure 4.4.

4.15. Show that $(\tfrac{1}{2}QP_1 + CA)\,\tfrac{1}{3}GE^2$ is the volume of the wedge shown in Figure 4.4.
This expression can be used to obtain the formulas $(m_1)$ and $(m_2)$.

4.16. The principle of Cavalieri states that if solids have bases in the same plane 
and if the areas of intersections with planes parallel to the base are equal, then the 
volumes are equal. Thus, in Figure 4.18 one can show that the intersections of a plane 
parallel to the base of a hemisphere and cone have a total area equal to the intersection 
with a cylinder, i.e., 

$$\pi y^2 + \pi x^2 = \pi r^2,$$

and this will yield the volume of the hemisphere, assuming one knows the volume of the 
cone and cylinder. This is one step in a sequence of comparisons that permits one to 
obtain volumes on an elementary basis, i.e., without using antiderivatives. Conditions 
for the equality of volumes in the case of prisms and also in the case of pyramids are 
established, and these conditions permit comparison with the rectangular parallelepiped 
by linear and planar constructions and results in the usual formulas for the prism and 
pyramid. The volume of a cone is obtainable by an exhaustion process. The principle of 
Cavalieri can be considered to be a volume axiom for this process. 



4.17. Compare the area of a spherical zone with the corresponding band on the 
circumscribed cylinder with same axis as the zone. 

4.18. Since the range of problems that can be solved by ruler-and-compass constructions
was limited, the ancient geometers invented mechanical devices to perform
constructions equivalent to solving cubic and higher-degree algebraic equations. One
such problem was to find the double mean proportional between two line segments, say,
$a$ and $d$. The most straightforward geometrical representation of such a situation is
given by the diagram of a triangle $APT$ (Figure 4.19), with lines $BQ$, $CR$, and $DS$ parallel
to the base $AP$ and the lines $AQ$, $BR$, and $CS$ parallel to each other. For such a construction

$$\frac{a}{b} = \frac{AT}{BT} = \frac{QT}{RT} = \frac{b}{c} = \frac{BT}{CT} = \frac{RT}{ST} = \frac{c}{d}.$$

Thus, if we let $p = a/b = b/c = c/d$, then $p^3 = a/d$. Mechanical devices involving rods and sliding
collars are readily devised that realize the geometrical restraints in our figure. But if
these are to be used in a given context, there must be a careful mechanical analysis that
must be based on a mathematical analysis. Suppose the lengths $a$ and $d$, the point $P$, and
the line $PT$ are specified. The triangle $APT$ can then be determined by two further
parameters. With $\angle APT$ and the length $d$ given, the triangle $DST$ is determined and so is
the rest of the figure. Thus, the figure has two degrees of freedom. Describe mechanisms
with varying degrees of freedom and how practical limitations reduce the degree of
freedom. For example, input or output may have to be along certain lines.

4.19. One famous ancient problem was trisecting an angle. A diagram that permits
a trisection is shown in Figure 4.20. A circle is drawn and the angle to be trisected is
realized as the central angle, $\theta = \angle AOB$. One extends $AO$ beyond the circle. One then
constructs the desired angle $\angle BRA$ by turning a line $BR$ around $B$ until the segment
$RS$ equals the radius $r$ of the circle. If $AOP$ is a diameter and we start from the initial
position $BP$ and move the intersection point $R$ out, the segment $RS$ will increase from
zero. One can readily show by means of isosceles triangles and the theorem that an
exterior angle is the sum of the alternate interior angles that $\theta = 3\alpha$, where $\alpha = \angle BRA$.
Discuss mechanisms for bisecting and trisecting angles.



4.20. List the propositions in Euclid that are incorrect. 

4.21. Obtain a plane layout for the icosahedron (see Exercise 4.6). 

4.22. What is the value of the dihedral angle between the faces of the various regular 
solids? For the inscribed sphere suppose a face is tangent at a pole. For the various solids 
what would be the latitude of the vertices for this face? What can one say about the 
longitude? 

4.23. An Archimedean solid is a convex polyhedron having regular polygons as 
faces, equal edges, and the configurations around the vertices all congruent. Determine 
the set of these. What about plane layouts for each? 
5


Transition 

and Developments 


5.1. Algebra 

It will be recalled that Sumerian mathematics involved problems for
which a solution was given in algorithmic form. The problem was stated in
the form of a specific situation with definite numbers, and the procedure for
obtaining the solution was a sequence of arithmetic operations. The arithmetic
character is in contrast with the “geometric algebra” of Euclid.

Diophantus was a mathematician of Alexandria who wrote a treatise
consisting of a sequence of essentially arithmetical problems (see Ver Eeke(30)
or Heath(10)). It is suspected that this work is a development of Babylonian
mathematics, but certainly it is more sophisticated. Let us consider one of the
problems, i.e., 27 in Book I of the Arithmetica of Diophantus (the following
corresponds, in general, to the modern language used in the French translation
of Ver Eeke; the translation in Heath follows the more cryptic original):

To find two numbers such that their sum and product correspond to given numbers. 

It is necessary that the square of half the sum less the product be a square.... 

Suppose then that the sum of the numbers be 20 units and the product be 96 units. 

Let the difference be 2ς. Then since the sum is twenty units if we divide this into two
equal parts, each of these parts will be half of the sum or 10 units. Then if we add to one of
the parts and subtract from the other part, one half of the difference, ς, one will obtain again
that the sum of the numbers is 20 units and the difference is 2ς. Consequently the largest
of the two numbers is 10 plus ς and the smallest of the numbers is 10 less ς.

It is required that the product of these numbers constitute 96 units. Their product is
100 less ς² which we equate to 96 and ς = 2. Consequently, the larger number is 12 and the
smaller is 8 and these numbers satisfy the requirements.
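The running prose argument translates directly into a short procedure. The following is a minimal modern sketch, with function names chosen for illustration and no claim to reflect Diophantus’s own notation: it forms the square of half the sum less the product, requires that quantity to be the square of a rational, and returns the two numbers as half the sum plus and minus the root.

```python
from fractions import Fraction
from math import isqrt

def rational_sqrt(q):
    """Square root of a non-negative Fraction if it is rational, else None."""
    if q < 0:
        return None
    n, d = isqrt(q.numerator), isqrt(q.denominator)
    if n * n == q.numerator and d * d == q.denominator:
        return Fraction(n, d)
    return None

def two_numbers(total, product):
    """Numbers with the given sum and product, or None if the condition fails."""
    half = Fraction(total, 2)
    root = rational_sqrt(half * half - product)  # half the difference
    if root is None:
        return None
    return half + root, half - root
```

With a sum of 20 and a product of 96 the procedure returns 12 and 8, as in the quoted passage; with a product of 97 the required square does not exist and the procedure reports failure.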
This problem is elementary and still well within the capabilities of
Babylonian mathematics. Nevertheless it has distinguishing characteristics 
that represent considerable intellectual development: 

(a) The numbers are dissociated from any enumeration or measurements
of a specific situation such as occurred in the older mathematics or
from a geometric interpretation as in Euclid. Instead the numbers are
expressed as a quantity of an abstract unit, M°, in the classical Greek system.
This system is a nonplace decimal system in which different letters are used
for units, tens, and hundreds. Thus 2, 20, and 22 would be represented
M°β, M°κ, and M°κβ, respectively.

(b) The discussion is in terms of properties of numbers. It is a running 
prose argument that one can readily translate into equation form. Thus, it is 
equivalent to modern elementary algebra. 

(c) The numbers are rational and positive so that one must require that
a number must be a square of a rational if one is to take a square root. Negative
and imaginary solutions to problems are ignored. Many such problems
of Diophantus are equivalent to finding integers or classes of integers that
satisfy polynomial relations, and the modern term “Diophantine equations”
refers to such situations.

(d) One development was the use of a symbol, ς, for an unknown
quantity. Presumably ς stands for ἀριθμός, the word for number. The various
powers of ς were also indicated by symbols. The square of ς is represented by
Δ^Υ for δύναμις (power), the cube of ς by K^Υ for κύβος (cube), and the fourth,
fifth, and sixth powers, by Δ^ΥΔ^Υ, Δ^ΥK^Υ, and K^ΥK^Υ, respectively. There was
also a way of representing the reciprocals of these powers.

(e) The problem is stated using the term “given numbers.” But in the 
procedure this term is replaced by specific values, and the development is the 
same relative to these as in the previous algorithmic mathematics. 

The work of Diophantus continued to have considerable influence on
subsequent mathematics and mathematicians. For example, Fermat’s
interest in mathematics was awakened by the edition compiled by Bachet,
and Fermat’s famous “last theorem” was stated in the margin of this work.
From the collapse of the Roman Empire in the West until the fall of Constantinople
in A.D. 1453, mathematics was mostly in the hands of Arabian and
Eastern scholars who emphasized the “problem tradition” of mathematics.
One important development was that of decimal arithmetic using a place
notation. The standard histories of mathematics offer many fascinating
details.

During the Renaissance there was a tremendous European interest in
mathematics, and this interest produced a more sophisticated and effective
algebra. The symbols $+$, $-$, $=$, $\times$, and $\sqrt{\ }$ were introduced in the sixteenth
and early seventeenth centuries. In the work of Vieta (1591) there
appears a logical development of algebra based on the appropriate theorems
of Euclid. Vieta regarded algebra in which operations are on letters as
representing a higher form of mathematics without hypotheses such as
Proclus desired. Descartes introduced (1637) the notation $x$, $y$, and $z$ for the
unknown quantities.

The cubic and quartic equations were solved by Italian mathematicians
of the sixteenth century (see Smith(25)). A cubic equation would be expressed
at that time as “cub^s p; 6 reb^s aeqlis 20.” The superscript s refers to the unknown
and the expression is shorthand for “the cube plus 6 times the thing
itself equals 20.” The method used is applicable to a general cubic equation
in the form $x^3 + px = q$, although it was stated for specific values of $p$ and $q$.
Consider the identity in $u$ and $v$

$$(u - v)^3 + 3uv(u - v) = u^3 - v^3.$$

If we find a $u$ and $v$ such that $3uv = p$, $q = u^3 - v^3$, then $x = u - v$ is a solution.
But if we let $a = u^3$, $b = v^3$, we have $a - b = q$ and $ab = p^3/27$, and we can readily
find $a$ and $b$.
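The procedure can be carried out for the quoted example, “the cube plus 6 times the thing itself equals 20.” This is a minimal sketch that assumes $p > 0$, so that both $a$ and $b$ are positive and real cube roots suffice; the function name is illustrative.

```python
import math

def solve_depressed_cubic(p, q):
    """A real root of x^3 + p*x = q, assuming p > 0.

    From a - b = q and a*b = p^3/27, a is the positive root of
    a^2 - q*a - p^3/27 = 0, and b = a - q.
    """
    disc = math.sqrt(q * q / 4 + p ** 3 / 27)
    a = q / 2 + disc        # a = u^3
    b = -q / 2 + disc       # b = v^3
    u = a ** (1.0 / 3.0)
    v = b ** (1.0 / 3.0)
    return u - v

root = solve_depressed_cubic(6, 20)
```

For $p = 6$ and $q = 20$ the values are $a = 10 + \sqrt{108}$ and $b = -10 + \sqrt{108}$, and the difference of their cube roots is 2 up to rounding.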

The solution of the quartic is dependent on solving an intermediate
cubic. Suppose we wish to solve $x^4 + ax^2 + bx + c = 0$. We can write this in the
form

$$(x^2 + r)^2 = sx + t$$

for $r = a/2$, $s = -b$, $t = -c + a^2/4$. Let us now add a quantity $y$ to $x^2 + r$. Then

$$(x^2 + r + y)^2 = 2yx^2 + sx + t + 2ry + y^2.$$

The right-hand side is in the form $2y(x + A)^2$ for $A = s/4y$ provided

$$y^3 + 2ry^2 + ty - s^2/8 = 0.$$

If we solve this cubic equation, we can express our given equation in the form

$$(x^2 + r + y)^2 = 2y(x + A)^2.$$

We can now extract the square root of each side and obtain a quadratic in $x$.
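The reduction to an intermediate cubic can be followed step by step in code. The sketch below makes two assumptions that are not in the text: it requires $b \neq 0$, so that the cubic is negative at $y = 0$ and has a positive root, and it finds that root by bisection rather than by the cubic formula. The function name is illustrative.

```python
import cmath

def ferrari_roots(a, b, c):
    """Roots of x^4 + a*x^2 + b*x + c = 0, assuming b != 0."""
    r, s, t = a / 2, -b, -c + a * a / 4

    def g(y):
        # Intermediate cubic y^3 + 2r*y^2 + t*y - s^2/8
        return y ** 3 + 2 * r * y ** 2 + t * y - s * s / 8

    # g(0) < 0 and g grows without bound, so a positive root exists.
    lo, hi = 0.0, 1.0
    while g(hi) < 0:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    y = hi
    big_a = s / (4 * y)
    k = (2 * y) ** 0.5
    roots = []
    for sign in (1, -1):
        # x^2 + r + y = sign*k*(x + A) is a quadratic in x.
        lin, const = -sign * k, r + y - sign * k * big_a
        d = cmath.sqrt(lin * lin - 4 * const)
        roots += [(-lin + d) / 2, (-lin - d) / 2]
    return roots
```

For $x^4 - 25x^2 + 60x - 36 = 0$, whose roots are 1, 2, 3, and $-6$, the bisection settles on $y = 4.5$ and the two quadratics are $x^2 - 3x + 2 = 0$ and $x^2 + 3x - 18 = 0$.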

Clearly these procedures are based simply on the binomial theorem for
the second and third powers. The formulas for arithmetic and geometric
series were of course known. There are also formulas for sums of powers of
integers such as

$$1 + 2^2 + \cdots + n^2 = (n + 1)(2n + 1)n/6,$$

which are readily proven by induction on $n$.
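For the sum of squares, the induction can be written out as follows. This is a routine sketch of the standard argument, supplied here for illustration. The formula holds for $n = 1$, since $1 = 2 \cdot 3 \cdot 1/6$, and if it holds for $n$, then adding $(n + 1)^2$ gives the formula for $n + 1$:

```latex
\begin{aligned}
\frac{n(n+1)(2n+1)}{6} + (n+1)^2
  &= \frac{(n+1)\bigl[n(2n+1) + 6(n+1)\bigr]}{6} \\
  &= \frac{(n+1)(2n^2 + 7n + 6)}{6} \\
  &= \frac{(n+1)(n+2)(2n+3)}{6}.
\end{aligned}
```

The last expression is the right-hand side of the formula with $n$ replaced by $n + 1$.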

This European development of algebra presented arguments using an
increasingly flexible notation in which the manipulation of equations
appears to have a logical significance equivalent to the arguments of geometry.
Nevertheless, the classical approach in which numbers or the quantitative
aspects of magnitudes have an adjectival character rather than an objective
character was still dominant, and arguments could be considered rigorous
only if they could be referred to Euclidean or Archimedean axioms.

But the increased facility in algebra was itself significant. The use of the 
symptoms for the conic equations led naturally to analytic geometry and new 
ways for finding tangents to curves. The formulas in which n was permitted 
to be indefinitely large led naturally to the idea of an infinite sequence. 
These formulas also provided methods of finding areas that were a form or at 
least a precursor of the integral calculus. 

For further discussion of the material in this section, see Smith,(25)
Struik,(26) and Ver Eeke.(30)




5.2. Non-Euclidean Geometry 

For almost twenty centuries the fifth postulate of Euclid was a challenge 
to those who believed that it was unnecessary and a consequence of the 
remaining assumptions of geometry. But this effort was climaxed by the 
recognition that the postulates of Euclid correspond to a precise analysis 
of the relationship between points and lines in what is now referred to as the 
Euclidean plane and that variations on the fifth postulate yield other, “non- 
Euclidean,” geometries. It is a remarkable justification of Euclid. 

The notion of an angle at a given vertex occurs at the very beginning of 
Euclid, as does the configuration of linear angles at a given vertex. Postulate 
4, which is critical, states that all right angles are equal. It implies immediately 
the equality of vertical angles, but this would also follow if one assumed that 
all the right angles at any one point are equal. The significance of Postulate 4
can be illuminated by considering cases in which right angles at different
points are different. For example, the surface of one nappe of a cone has a
geometry in which the configuration of lines through a given point is quite 
similar to that for a point in the Euclidean plane except for the vertex. A 
right angle at the vertex is equal to an acute angle at any other point. (A right 
angle is half a straight angle. To define straight lines for points other than the 
vertex, we would use the fact that the surface of a cone can be laid out so that 
any sufficiently small piece not containing the vertex is flat. A straight line 
through the vertex would be a configuration of two coplanar elements.) On 
the other hand, many Riemann surfaces have exceptional points at which a 
right angle is equal to a straight angle at other points. The concept of a 
tangent plane permits one to compare the angle and line configurations of 
points on quite general surfaces. 
Postulate 5 permits one to compare point configurations at different
points. The line joining two points can be used to orient the angular configuration
associated with each point. The interplay of incidence relations for the
rays originating at the points and the angles of these rays with the reference 
line is a fundamental aspect of the geometry. 

For example, one can distinguish between three kinds of geometry by 
means of the Saccheri quadrilateral. Let us start with a right angle ABC, and 
at C erect another right angle BCD , and at D erect the right angle CDA. If 
the right angles are “interior” right angles, Postulate 5 insures that DAB is a 
right angle. But one can also consistently require it to be an acute angle, 
which yields hyperbolic geometry, or one can require it to be obtuse, as in 
spherical and elliptical geometry. 

In differential geometry, the angular configuration at a point consists of 
the tangent lines of curves through the point. The usual bilinear form yields 
the cosines of angles. An important idea here is the way in which rays at one 
point are associated with rays at another. This is referred to as “parallel 
displacement.” Thus, even in this new sophisticated geometry, the basic 
analysis of Euclid is still valid. 

The process by which non-Euclidean geometries were discovered is, of 
course, well known. Its major significance is in the fact that it established that 
geometry was not an ideal abstraction of spatial experience corresponding 
to a higher form of truth. For if this were so there would not be this ambiguity, 
which can only be resolved by experiment. Mathematics provides invented 
geometrical constructions, some of which may be associated with experience 
in the form of a mathematical theory for spatial experience. 

For further discussion of the material in this section, see Carslaw(2)
and Manning.(15)


5.3. Geometric Developments 

The result of introducing Cartesian coordinates into geometry is familiar 
to students of mathematics. Having chosen a unit length it is natural to 
associate the possible geometric ratios with the points on the line. This yields 
the equivalent of real numbers, including negative numbers, and this was an 
important addition to algebra. The geometric interpretation of complex 
numbers was also important in obtaining acceptance (see Smith (24) ). The 
equivalence of pairs and triplets of real numbers with two- and three-dimensional
Euclidean space meant that geometric arguments and constructions
have equivalent algebraic equations and manipulations. When this analytic 
structure is generalized to n dimensions, it is of course a form of analysis 
expressed in a geometric terminology. 
Classical geometry contained a considerable body of knowledge that
was available for this transition. The use of algebra greatly expanded the 
possibilities for various geometric notions. For example, many more surfaces 
could be considered. Analytic geometry was also intimately related to the 
development of the calculus, and the notion of limits permitted general 
procedures for taking tangents and tangent planes. But perhaps the most 
basic aspect of the use of numerical models for geometry was their incorporation
into the “main line” of mathematics, in which arguments are based on
set-theoretical logic and which, of course, includes the real numbers. 

The self-imposed requirement of classical mathematics that the logical 
discussion should be based on explicitly stated “first principles” and that 
only they be used was never completely fulfilled. As long as the emphasis 
was on reasoning involving diagrams these discrepancies were not often 
considered, and indeed for a long time there was not the rigorous mathematical
discipline that would be sensitive to such logical inadequacies. But
after analysis had been established on a set-theoretical logical basis, there 
was considerable interest in the corresponding formulation of the classical 
example of axiomatic development by distinguished mathematicians such 
as Schur and Hilbert. 

Actually, there was a considerable logical refinement in regard to various 
geometrical notions such as incidence relations and the order of points on a 
line. Consequently, it was possible to formulate not only Euclidean geometry 
on a precise basis but also a variety of other geometries by changing or 
omitting axioms. The general resurgence of interest in mathematics, which 
began in the sixteenth century, had a rather small but concomitant development
in classical diagrammatic geometry, which reached a striking climax
in the nineteenth century. 

Thus, the present total logical development called “synthetic geometry” 
greatly exceeds the classical development and has a logical structure satisfying
the same rigorous standards as any other part of modern mathematics.
There is a tendency to consider this synthetic geometry as of no practical
significance, but it is not unusual to find interesting and fascinating applications.
Moreover, the type of logical reasoning that structures modern synthetic
geometry is highly significant for applications. The normal conceptual 
procedure is in diagrammatic forms and thus has the appeal and imaginative 
capabilities of traditional geometry. But these constructions and deductions 
must be disciplined so that each step can be precisely analyzed into the basic 
incidence and order axioms. 

In the set-theoretical logical form, a geometry is considered to be a conglomerate
of sets of various kinds of objects with relations between objects
in different sets or even in the same set. Thus, for an incidence geometry in a
plane one has two kinds of objects, points and lines, and the relations of
“point on a line” and “order of points on a line.” There would also be a
number of constructive axioms stating the existence of examples of such 
relationships under various hypotheses. 

For such conceptual conglomerates there are many metaprocedures, 
such as monomorphisms, which are one-to-one correspondences of one 
conglomerate into another that preserve the internal structure, or isomorphisms,
which are monomorphisms onto. A geometry can be shown to be
free from contradictions if there is an “analytic model,” i.e., an isomorphic 
image consisting of numerical concepts. This assumes that the number 
concepts themselves do not imply a contradiction. One may also consider 
monomorphic images in Euclidean geometry that one may be willing to 
accept as consistent. Models can also be used to show the independence of a 
specific axiom from others in a set. The concept of numerical limit in relation 
to analytic models permits one to establish in a consistent way the notion of 
tangency and a number of other limiting relations between analytic models. 
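
As a small illustration of an analytic model, the following Python sketch builds a finite incidence geometry out of arithmetic modulo 3 and checks two incidence axioms by exhaustion. The modulus, the names `points`, `line`, `lines`, and `lines_through`, and the particular axioms tested are assumptions made for illustration only. A finite model of this kind carries no order relation, so it models incidence alone.

```python
from itertools import combinations, product

P = 3  # modulus; an illustrative choice (assumed, not from the text)

# Points are number pairs (x, y) with coordinates taken modulo P.
points = list(product(range(P), repeat=2))

def line(a, b, c):
    """Set of points satisfying a*x + b*y = c (mod P)."""
    return frozenset((x, y) for (x, y) in points if (a * x + b * y - c) % P == 0)

# Every equation with (a, b) != (0, 0) defines a line; duplicates collapse in the set.
lines = {line(a, b, c)
         for a, b, c in product(range(P), repeat=3)
         if (a, b) != (0, 0)}

def lines_through(p, q):
    """All lines incident with both points p and q."""
    return [l for l in lines if p in l and q in l]

# Axiom checked: two distinct points lie on exactly one line.
unique_join = all(len(lines_through(p, q)) == 1
                  for p, q in combinations(points, 2))

# Axiom checked: every line carries at least two points.
two_points = all(len(l) >= 2 for l in lines)

print(len(points), len(lines), unique_join, two_points)
```

Since the axioms hold in a structure built entirely from integers, any contradiction derivable from them would also be a contradiction in arithmetic, which is the sense of consistency described above.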

For further discussion of the material in this section, see Daus, (4) 
Forder, (8),(9) Hilbert, (12) and Lines. (14)


5.4. Geometry and Group Theory 

It will be recalled that Proclus described the “generality” inherent in 
geometrical reasoning as due to the assumption that reasoning with a typical 
example would be valid for all examples of the same type. In the argument 
only properties explicitly connected with the type are to be used. But if we 
are looking at geometry as a whole, this raises the question as to what properties
are available to form “types.” For rectilinear figures the basic properties
are equality of angles or equality of line segments. When figures have certain 
combinations of both these properties we have congruence, and when only 
angular properties appear we have similarity. 

This suggests that one look more closely at the notions of equality of
angles and line segments and congruence. Two figures are said to be congruent
if there is equality of corresponding angles and line segments. But
equality of angles or line segments would usually be interpreted to mean
that if one “moved” one figure to coincide in part with another figure, the
part in question would completely coincide. Such a “motion” can be
considered as a construction of the moved figure in a new position. This construction
produces for any given point in the plane a corresponding “moved”
point and thus is a “point transformation.” The reader will immediately
recognize that the set of these point transformations constitutes a group for the
congruence case, and this is also true for similarity situations. The equality
properties are invariants of the figures under these groups of point transformations.
Thus, a geometry is associated with a group of point transformations,
and the theorems become statements about invariance.
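
The group property and the invariance of distance can be checked numerically. The sketch below is only an illustration under stated assumptions: it represents points of the plane as complex numbers, treats only the direct motions (a rotation followed by a translation) and omits reflections, and the names `motion`, `apply`, `compose`, and `inverse` are hypothetical helpers, not notation from the text.

```python
import cmath

def motion(theta, shift):
    """Direct plane motion z -> exp(i*theta)*z + shift, stored as a pair (u, b)."""
    return (cmath.exp(1j * theta), complex(shift))

def apply(m, z):
    """Image of the point z under the motion m."""
    u, b = m
    return u * z + b

def compose(m2, m1):
    """Single motion equivalent to applying m1 first and then m2."""
    u1, b1 = m1
    u2, b2 = m2
    return (u2 * u1, u2 * b1 + b2)

def inverse(m):
    """Motion that undoes m."""
    u, b = m
    return (1 / u, -b / u)

# Two arbitrary points and two arbitrary motions (illustrative values).
p, q = 1 + 2j, -3 + 0.5j
m1 = motion(0.7, 2 - 1j)
m2 = motion(-1.9, 0.25 + 4j)

# Invariance: the distance between p and q is unchanged by a motion.
d_before = abs(p - q)
d_after = abs(apply(m1, p) - apply(m1, q))

# Closure: the composite of two motions is again a motion of the same form.
closure_error = abs(apply(compose(m2, m1), p) - apply(m2, apply(m1, p)))

# Inverses: applying a motion and then its inverse returns the original point.
inverse_error = abs(apply(inverse(m1), apply(m1, p)) - p)

print(d_before, d_after, closure_error, inverse_error)
```

Closure, inverses, and the identity motion `(1, 0)` are what make the set a group, and the equal distances before and after are an instance of an invariant.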

In the Euclidean plane the group of motions associated with congruence 
contains the parallel translations, the rotations around points, and the 
inversions in lines. For similarity one must add the homothetic transformations,
which are the expansions or contractions in a fixed ratio from a given
point. There is, however, an even larger group of transformations, which 
preserves the angular configuration at each point in the sense of preserving 
the angles between tangent lines. This group of conformal transformations 
includes the inversions in circles and is explored in terms of mappings of the 
complex plane. 

We can reverse this process and start with a group of point transformations
and concern ourselves with the invariants and the corresponding inferences
we could make relative to figures. The group of point transformations
can be expressed either analytically or by synthetic constructions. This
represents a modern metamorphosis of geometry that has become highly
significant in quantum mechanics and as the theory of Lie groups in mathematics.

The notion of a group of transformations under which certain properties 
are invariant can be associated with the mathematical theory of a science. 
If one is willing to assume on the basis of experimental evidence that such a 
group exists, one has a much more powerful mathematical machinery for the 
purpose of inference. Actually, in physics, for instance, one seems to have a 
complex of groups that of course corresponds to a complex of geometries. 

How does one infer the existence and specifications of such a group or 
groups on the basis of experimental evidence? One might expect that this is 
some general inductive process of inferring the laws of a theory from experimental
evidence. However, what actually happened in one instance in
physics is instructive. Originally Newtonian physics was set up on the 
basis of the “Galilean group” in space-time. A transformation in the Galilean 
group is determined by a change of a system of coordinates in space-time. 
The relevant system of coordinates consists of a Cartesian system in space 
and an additional time coordinate axis. The changes include the Euclidean
changes in space for such systems with the time coordinate unchanged, as
well as a subgroup of “uniform” motion changes given by

$$x' = x - vt, \quad y' = y, \quad z' = z, \quad t' = t,$$

where v is a group parameter. The full Galilean group is generated by the 
transformations mentioned. 
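
That the uniform-motion changes form a one-parameter group can be seen by composing two of them. The following sketch, restricted to the x and t coordinates, is an illustration only; the helper names `galilean_boost` and `then` and the numerical values are assumptions, not part of the text.

```python
def galilean_boost(v):
    """Return the map (x, t) -> (x - v*t, t) for the group parameter v."""
    def transform(x, t):
        return (x - v * t, t)
    return transform

def then(first, second):
    """Compose two space-time maps: apply `first`, then `second`."""
    def transform(x, t):
        return second(*first(x, t))
    return transform

event = (5.0, 2.0)     # an arbitrary event (x, t)
v1, v2 = 3.0, 4.5      # arbitrary group parameters

# Two successive changes with parameters v1 and v2 ...
combined = then(galilean_boost(v1), galilean_boost(v2))(*event)
# ... agree with a single change with parameter v1 + v2.
single = galilean_boost(v1 + v2)(*event)
# The parameter -v1 undoes the change with parameter v1.
undone = then(galilean_boost(v1), galilean_boost(-v1))(*event)

print(combined, single, undone)
```

The parameters simply add under composition, which is the Galilean law of addition of velocities.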

For a considerable range of mechanical phenomena this formulation of 
Newtonian physics on the basis of the Galilean group was satisfactory. 
However, Maxwell’s equations for electromagnetic phenomena are not
invariant under the Galilean group, but under the Lorentz group in which
the above-mentioned one-parameter group is replaced by

$$x' = [1 - (v/c)^2]^{-1/2}(x - vt), \quad y' = y, \quad z' = z,$$
$$t' = [1 - (v/c)^2]^{-1/2}(t - vx/c^2),$$


and the spatial transformations with t unchanged remain the same. 
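
A numerical comparison of the two one-parameter groups may help. The sketch below works in the x and t coordinates with units in which c = 1. The quadratic form (ct)^2 - x^2 used as the test invariant is a standard fact about the Lorentz group but is not mentioned in the text, and the function names and sample values are assumptions made for illustration.

```python
import math

def lorentz_boost(v, x, t, c=1.0):
    """Lorentz change with parameter v; units with c = 1 unless c is given."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c ** 2)

def galilean_boost(v, x, t):
    """Galilean change with parameter v."""
    return x - v * t, t

def interval(x, t, c=1.0):
    """The quadratic form (c t)^2 - x^2."""
    return (c * t) ** 2 - x ** 2

x, t, v = 0.3, 1.2, 0.6     # arbitrary event and a large parameter v/c = 0.6

s_before = interval(x, t)
s_lorentz = interval(*lorentz_boost(v, x, t))     # unchanged by the Lorentz change
s_galilean = interval(*galilean_boost(v, x, t))   # changed by the Galilean change

# For v/c of about 1e-4, roughly the earth's orbital speed, the factor
# [1 - (v/c)^2]^(-1/2) differs from 1 by only a few parts in a billion.
slow = 1.0e-4
gamma_excess = 1.0 / math.sqrt(1.0 - slow ** 2) - 1.0

print(s_before, s_lorentz, s_galilean, gamma_excess)
```

The smallness of `gamma_excess` at solar-system velocities is one way to see why the two groups are hard to tell apart from planetary observations.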

Thus, the Newtonian laws of mechanics and the phenomena of electromagnetism
are associated with different space-time geometries, and this cannot
be right, since it means we can appear to have changes in one set of
phenomena but not in the other by a mere change of coordinate systems.
Electromagnetism has had an extensive experimental verification of its
Lorentz invariance. The most effective test of Newtonian mechanics is in the
detailed motion of the solar system. In general the velocities in the solar
system are such that, in the range that is experimentally available,
corresponding transformations in the two groups are practically
indistinguishable. However, there is a small but measurable discrepancy 
between the Newtonian prediction for the motion of the planet Mercury and 
the actual motion. But if one uses a mechanics invariant under the Lorentz 
group, this discrepancy disappears. Thus, the Lorentz group geometry should 
be used for both sets of phenomena. 

Thus the choice of groups was a matter of adjusting to experience, not of
a large-scale inductive process. The search for an appropriate group is
clearly equivalent to searching for the correct axioms of a geometry. A 
geometry may have a simpler geometry as a limit when certain parameters 
approach zero. The expansion of an area of experience may require one to go 
from a simpler to a more complex geometry, represented by a different 
choice of a group of transformations or modification of a set of axioms. This 
is not a matter of arranging a large number of facts into a pattern discovered 
inductively, but rather adjusting a pattern to conform to a wider range of 
experience. 

For further discussion of the material in this section, see Bierberbach, (1) 
Daus, (4) and Einstein et al. {1) 


5.5. Arithmetic 

Computational procedures were referred to in classical times as “logistics”
as distinct from the more intellectual consideration of the properties
of numbers, which was referred to as “arithmetic.” Arithmetic in the form of
computation is associated with two distinct areas of application. One of these
is scientific or technical, and in the ancient world this was mainly astronomy.
Astronomical tables and calculations were performed in a sexagesimal system
in which the numbers in specific places were expressed in the Greek decimal
system, that is, with letters for different multiples of ten and for different
digits.

But arithmetic is also part of trade, tax collecting, and military logistics, 
and it is more than likely that these applications influenced its development. 
The use of sexagesimal fractions was probably originally associated with
coinage, and it is also possible that the place system originated with counting 
boards. 

Counting boards were intended to perform the arithmetic associated 
with business or government transactions. Numbers were represented by 
means of markers placed in rows ruled on the board. Each row corresponded 
to a place in a place system. The number of markers in a row could correspond 
to the digit in this place, or the markers could have symbols on them to 
indicate their value, or colored markers of different values could be used. The 
arithmetic operations were performed in a manner similar to those on the 
abacus (see Moon (17) ). 

Counting boards were widespread until paper became inexpensive 
enough to permit its use for computation. The word “calculus” means a 
little stone, i.e., a marker. A variation of the counting board was a sandbox 
with divisions in which marks could be made in the sand with a finger. The 
abacus in its modern form was a further development. 

The development of the decimal place system for representing integers, 
with a digit zero, was due to Hindu mathematicians and was introduced to the 
West through Arabian channels. Decimal integers could be used with fractions
such as twelfths in a wide range of applications, but decimalization was
climaxed by the use of decimal fractions in the work of Stevin (1585), and this 
permitted the development of our usual arithmetic in the next 150 years. 

A calculating machine was designed by Pascal, but essential improvements
were due to Leibnitz, whose logical design was used in calculators for
many years. Rotary calculators were of great practical significance until
the 1950s and used a completely decimal notation. But advances in electronic
circuitry led first to large-scale data processing, which dominated business
and government, and then to small devices of remarkable capability. Electronic
circuitry is better adapted to the binary, or radix two, system
than to the decimal system. Boolean algebra is readily represented, as are
two-state circuits. 

For further discussion of the material in this section, see Moon, (17)
Murray, (19) and Struik. (26)


5.6. The Celestial Sphere

Astronomy and mathematics interacted with each other in many significant
ways, and our modern exact sciences are lineal descendants of this
interaction. For most applied mathematicians even the more elementary
technical relations between these subjects are important, and these relations
are readily available in such books as that by Smart. (23) Students interested
in applied mathematics will certainly find in this area valuable and fascinating 
ways to enhance their facility with three-dimensional geometry. However, our 
immediate concern will be with certain aspects involved in the development of 
mathematical understanding. 

The distances of the fixed stars from the sun are enormous compared 
with the width of the earth’s orbit. The angular displacement of even the 
nearest fixed stars is about 1.5" of arc and can be observed only by telescopic 
photography. Thus, the stars have fixed angular relations when observed 
from earth, and these angular distances seem to be constant in time and space. 
Thus, one may consider the stars to be on a “celestial sphere” of indefinitely 
large radius. 

Due to the rotation of the earth, this whole sphere appears to revolve 
in a little less than a day. After the twilight that follows sunset, a certain part 
of this sphere is visible. This visible part turns westward, with a new portion 
rising in the east. This motion is a rotation around a fixed line parallel to the 
earth's axis. The points where this axis intersects the celestial sphere are the 
poles. In the northern hemisphere we see only one—the North Pole. The 
North Pole is fixed relative to us as well as on the celestial sphere. 

In addition to this fixed axial direction, a plumb line will determine a 
direction of “straight up” or “straight down.” We can also think of a sphere, 
fixed relative to us, and coincident with the celestial sphere. These two fixed 
directions determine a plane called the plane of the meridian. This plane 
intersects the fixed sphere in a great circle called the meridian. The celestial 
sphere rotates past this great circle westward. 

The North Pole corresponds both to a point on the celestial sphere and 
a point on the fixed sphere. Thus, the great circles for which it is the pole 
coincide and these are called the equator. Of course, the equator rotates with 
the celestial sphere when it is considered on the celestial sphere so that its 
intersection with the meridian moves eastward on the celestial sphere. 
Another great circle on the fixed sphere is the horizon, which has the vertical 
direction as its pole. The horizon is mathematically defined; it is not the 
apparent horizon of observation. 

Gravity permits one to readily determine the vertical direction and the 
horizontal plane. The direction of the North Pole can also be readily determined.
These determine a spherical coordinate system in which a longitude
variable, the “azimuth,” is measured along the horizontal circle from the
northern meridian point. The latitude variable is called the elevation.
Azimuth is designated as either “east” or “west.” The zenith is the point
immediately overhead, and the “zenith distance” of a point is the complement 
of the elevation. With limited resources, these angles—azimuth, elevation, 
and zenith distance—are most readily measured. 

The light of the sun blanks out the stars and the celestial sphere when the 
sun rises, but in a single day it moves across the sky from east to west on a 
path similar to that of the stars on the celestial sphere. Thus, it is natural to 
assume that the motion of the sun is part of the motion of the celestial sphere, 
and this will agree with casual observation over a few days. But more careful 
observation will show that the part of the sphere visible at night changes 
somewhat from night to night as if the celestial sphere were turning faster 
around the earth than the sun. 

One can assume that the sun partakes of the motion of the celestial 
sphere but moves backward on it one complete revolution a year. Because 
of this relative motion, the part of the celestial sphere visible at night at 
different times in the year changes, so that one can plot the stars that appear 
on this entire globe, except for a cap around the South Pole, which is always 
below the horizon. The size of this unseen cap depends on latitude. On this 
completed globe the sun follows a fixed path, which appears at first to be the 
same from year to year. 

This apparent motion of the sun on the celestial sphere is due to the 
orbital motion of the earth. The mean distance of the sun is about 23,000 
times the radius of the earth, so that to an observer at the North Pole and one 
at the equator, the line of sight to the sun would be parallel to within 8" of arc. 
Thus, the sun appears at any instant to have a position on the celestial sphere 
that is practically independent of the position of the observer. The vector 
joining the earth to the sun always moves in the orbital plane, and this vector, 
of course, corresponds to the apparent direction of the sun from the earth. 
Thus, the apparent motion of the sun is essentially on a plane fixed in regard 
to the observer. The intersection of this plane with the celestial sphere is the 
great circle called the ecliptic, the path of the sun on the celestial sphere. 

The ecliptic intersects the equator in two points, one of which is associated
with the constellation of Aries. To an earthbound observer the ecliptic
appears to revolve with the celestial sphere, so that this point of Aries, which 
always moves on the fixed great circle the equator, corresponds to the position 
of a hand on a 24-hour clock. If we take the zero hour to be the instant when 
this point crosses the visible part of the meridian, the sidereal time of day is 
indicated by the westward arc on the equator from the meridian to the present 
position of the point in Aries. 

On the other hand, this point of Aries is also a fixed reference point on
the equator of the celestial sphere. Thus, on the celestial sphere itself, one can 
set up a spherical coordinate system with one angular coordinate similar to 
longitude measured eastward along the equator from this point in Aries. This 
coordinate is called the right ascension of a star position. The equivalent of 
latitude is called the declination and is positive toward the North Pole. This 
system of coordinates is called “equatorial.” There is a similar system called 
“zodiacal” with “celestial longitude” measured eastward along the ecliptic 
from the point in Aries and “celestial latitude” positive on the North Pole
side of the ecliptic.

For further discussion of the material in this section, see Smart. (24) 


5.7. The Motion of the Sun 

For people living in the temperate zone, probably the most obvious
celestial phenomenon is the seasonal variation in the length of day and the
related position of the sun during the day. These depend directly on the
declination of the sun. In its motion along the ecliptic, the sun is above the
equator during spring and summer and below it during the fall and winter.
The dependence of the length of day on the sun’s declination, $\delta$, is discussed
in Smart (24) (Chapters II and III). The sun has celestial latitude 0, and hence
$\sin \delta = \sin \epsilon \sin \lambda$, where $\epsilon \approx 23^\circ 27'$ and $\lambda$ is the celestial longitude of the sun
and should therefore be considered as a function of the time (see Exercise
5.19).
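
The relation between the sun's declination and its celestial longitude (the sine of the declination equals the sine of the obliquity times the sine of the longitude) is easy to tabulate. The sketch below is an illustration only; the function name `solar_declination_deg` and the choice of sample longitudes are assumptions.

```python
import math

OBLIQUITY_DEG = 23.0 + 27.0 / 60.0   # obliquity of the ecliptic, about 23 deg 27 min

def solar_declination_deg(longitude_deg, obliquity_deg=OBLIQUITY_DEG):
    """Declination of the sun (degrees) from its celestial longitude (degrees)."""
    eps = math.radians(obliquity_deg)
    lam = math.radians(longitude_deg)
    return math.degrees(math.asin(math.sin(eps) * math.sin(lam)))

# Longitudes 0, 90, 180, 270 correspond to the equinoxes and solstices.
for lam in (0, 90, 180, 270):
    print(lam, round(solar_declination_deg(lam), 2))
```

The declination is positive for longitudes between 0° and 180° (spring and summer in the northern hemisphere) and negative for the other half of the year, with extreme values equal to the obliquity.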

The time between two successive passages of the sun through the point
of Aries is practically constant whether measured in terms of the rotation of
the earth or by frequency procedures based on atomic phenomena. Thus, $\lambda$,
the sun’s longitude, is an increasing function of the time that differs from a
linear function $nt + \epsilon_0$ in a periodic way, i.e., $\psi(t) = \lambda - nt - \epsilon_0$ will be zero at
$t = 0$, if $\epsilon_0$ is the corresponding value of $\lambda$, and at $t = T, 2T, \ldots$, where $T$ is the
number of time units in a year. Thus, $\lambda$ changes by $2\pi$ as $t$ changes by $T$, and
$n = 2\pi/T$.

Figure 5.1. The earth's orbit.

Now consider a polar coordinate system $(r, \theta)$ for the orbital plane of the
earth centered at the sun and with reference line joining the sun to Aries.
One can readily see (Figure 5.1) that $\theta = \lambda + \pi \pmod{2\pi}$. If $\varpi$ is the angular
coordinate of perihelion, the orbital ellipse is given by

$$r = \frac{a(1 - e^2)}{1 + e \cos(\theta - \varpi)},$$

where $e$ is the eccentricity of the ellipse. Then Kepler’s second law becomes
$r^2 (d\theta/dt) = h = a^2 n (1 - e^2)^{1/2}$ (Smart, (24) p. 100). Let $v = \theta - \varpi$, $\lambda = v + \varpi \pm \pi$,
where the sign is chosen so that $0 \le \lambda \le 2\pi$. If $\tau$ is the time of perihelion, then
$v = 0$ at $t = \tau$. Then

$$\frac{dv/dt}{(1 + e \cos v)^2} = \frac{n}{(1 - e^2)^{3/2}}.$$

This equation can be integrated (see Exercise 5.20) into the form

$$v + \Phi(v) = n(t - \tau), \tag{2}$$

where

$$\Phi(v) = \sum_{n=1}^{\infty} (-1)^n \, 2\left[(1 - e^2)^{1/2} + 1/n\right] (e^*)^n \sin nv \tag{3}$$

for $e^* = e/[(1 - e^2)^{1/2} + 1]$. Now for the earth’s orbit, $e = 0.016726$ (1960) and
$e^* = 0.0083636$, and for the given accuracy we have

$$\Phi_E(v) = -0.033452 \sin v + 0.000210 \sin 2v. \tag{4}$$

For the major visible planets in the solar system the eccentricities of the
orbits are relatively small numbers. Mercury has an $e$ of about 0.2, and for
Mars $e = 0.093368$ (Smart, (24) p. 422) and

$$\Phi_M(v) = -0.186736 \sin v + 0.006547 \sin 2v - 0.000272 \sin 3v + 0.0000119 \sin 4v - 0.0000005 \sin 5v. \tag{5}$$

(We have retained an extra figure to minimize roundoff.) 
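
The coefficients in Equations (4) and (5) can be recomputed from the eccentricity alone. The sketch below takes the k-th coefficient of the series as (-1)^k 2[(1 - e^2)^(1/2) + 1/k] e*^k, the form that reproduces the printed values; the summation index is written k here to keep it distinct from the mean motion n, and the function name `phi_coefficients` is an assumption made for illustration.

```python
import math

def phi_coefficients(e, terms):
    """Coefficients c_k of Phi(v) = sum over k of c_k * sin(k*v), k = 1..terms."""
    root = math.sqrt(1.0 - e * e)
    e_star = e / (root + 1.0)
    return [(-1) ** k * 2.0 * (root + 1.0 / k) * e_star ** k
            for k in range(1, terms + 1)]

earth = phi_coefficients(0.016726, 2)   # eccentricity of the earth's orbit
mars = phi_coefficients(0.093368, 5)    # eccentricity of the orbit of Mars

for name, coeffs in (("earth", earth), ("mars", mars)):
    print(name, [round(c, 7) for c in coeffs])
```

The first coefficient is always -2e, since 2[(1 - e^2)^(1/2) + 1] e* = 2e, and the later coefficients fall off roughly like powers of e/2.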

Because of the smallness of the coefficients in Equations (4) and (5), it
is relatively easy to transform Equation (2) into the form

$$v = n(t - \tau) + \Psi(n(t - \tau)). \tag{6}$$

See Exercises 5.21 and 5.22. In particular we have

$$\Psi_E(x) = 0.033451 \sin x + 0.000349 \sin 2x + 0.000003 \sin 3x \tag{7}$$

and

$$\Psi_M(x) = 0.186533 \sin x + 0.010863 \sin 2x + 0.000877 \sin 3x + 0.000081 \sin 4x + 0.000007 \sin 5x. \tag{8}$$
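
The passage from Equation (2) to Equation (6) is an inversion: given x = n(t - τ), one solves v + Φ(v) = x for v and writes the result as x + Ψ(x). The sketch below does this numerically for Mars by simple fixed-point iteration and compares the result with the printed series for Mars. The iteration scheme, the truncation at 12 terms, and the function names are assumptions made for illustration, not the procedure of the exercises.

```python
import math

def phi(v, e, terms=12):
    """Phi(v) as a sine series with coefficients (-1)^k 2[(1-e^2)^(1/2) + 1/k] e*^k."""
    root = math.sqrt(1.0 - e * e)
    e_star = e / (root + 1.0)
    return sum((-1) ** k * 2.0 * (root + 1.0 / k) * e_star ** k * math.sin(k * v)
               for k in range(1, terms + 1))

def solve_v(x, e, iterations=50):
    """Solve v + phi(v) = x by the fixed-point iteration v <- x - phi(v)."""
    v = x
    for _ in range(iterations):
        v = x - phi(v, e)
    return v

def sine_series(x, coeffs):
    """Evaluate a_1 sin x + a_2 sin 2x + ... for the given coefficients."""
    return sum(a * math.sin((k + 1) * x) for k, a in enumerate(coeffs))

E_MARS = 0.093368
PSI_MARS = [0.186533, 0.010863, 0.000877, 0.000081, 0.000007]  # printed coefficients

# Largest difference between the numerical inversion and the printed series,
# sampled at one-degree steps of x around the orbit.
worst = max(abs(solve_v(x, E_MARS) - x - sine_series(x, PSI_MARS))
            for x in (math.radians(d) for d in range(360)))

print(worst)
```

The iteration converges quickly because the derivative of Φ is small, of the order of 2e. The residual `worst` reflects only the rounding and truncation of the printed coefficients.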

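Thus

$$\lambda = n(t - \tau) + \Psi(n(t - \tau)) + \varpi \pm \pi. \tag{9}$$

The table in Smart (24) (p. 422) gives $\varpi$ and the value of $\theta$, $\epsilon_0$, at $t = 0$. This
permits one to find $n\tau$, since

$$-n\tau + \Psi(-n\tau) + \varpi = \epsilon_0, \quad \text{or} \quad n\tau + \Psi(n\tau) = \varpi - \epsilon_0. \tag{10}$$

In view of the relation of Equations (2) and (6), this yields

$$n\tau = \varpi - \epsilon_0 + \Phi(\varpi - \epsilon_0) = \tau_0. \tag{11}$$

Thus, if

$$M = nt - \tau_0 + \varpi = nt + \epsilon_0 - \Phi(\varpi - \epsilon_0), \tag{12}$$

$$\lambda = M + \Psi(M - \varpi) \pm \pi. \tag{13}$$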

$M$ is a “mean” longitude with a uniform rate of change, and the “true”
longitude $\lambda$ differs from $M$ by $\pm\pi$ and a periodic function $\Psi$, which for the
earth has values less than 0.033454 radians, or 2°. Note that $\Psi$ is always less
than the change in $M$ for 2 days.
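
The comparison with two days of mean motion is a short piece of arithmetic. In the sketch below the bound of 0.033454 radians is the value quoted above, while the length of the year in days is an assumed rounded modern value that does not appear in the text.

```python
import math

PSI_BOUND_RAD = 0.033454     # bound on the periodic term, as quoted in the text
DAYS_PER_YEAR = 365.2422     # assumed length of the year in days

psi_bound_deg = math.degrees(PSI_BOUND_RAD)       # the bound in degrees
mean_motion_deg_per_day = 360.0 / DAYS_PER_YEAR   # uniform daily change of M
two_day_change_deg = 2.0 * mean_motion_deg_per_day

print(round(psi_bound_deg, 4), round(two_day_change_deg, 4))
```

The bound is a little over 1.9°, while the mean longitude advances a little under 1° per day, so the sun is never more than about two days ahead of or behind its mean position.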

For further discussion of the material in this section, see Smart. (24)


5.8. Synodic Periods 

Except for the moon, the objects in the solar system are at extremely 
large distances relative to distances available on earth. Venus comes closest 
to the earth, but if one had a baseline of 1000 kilometers between two 
observers, their two lines of sight toward Venus would still be parallel to within
5" of arc at the nearest approach. Both the radius of the earth and the
distance to the moon were measured in ancient times by procedures based 
on simple angle-measuring instruments, and the answers were probably 
correct to within a few percentage points. But the distances to the sun and the 
planets were either recognized as unavailable or greatly underestimated. 

Thus, direct geometric procedures can only yield angular information 
for the planets. This information in general deals with periodic phenomena 
or nearly periodic phenomena. For example, the outer planets are most 
readily observed when they are precisely opposite the sun on the celestial 
sphere. The orbital planes of these planets are inclined at about 1 or 2 degrees 
to the plane of the ecliptic. Thus, the periodic motion of the planet and that
of the earth will cause the sun, the earth, and the planet to align in longitude
at intervals that can be predicted approximately on the basis of a long-time
average. Let $M_E$ and $M_P$ be the $M$ of Equation (12) for the earth and the
planet, and suppose that at time $t = 0$ both have the value $m_0$. Then
$M_E = (2\pi/T_E)t + m_0$ and $M_P = (2\pi/T_P)t + m_0$. Since $M_E$ changes faster than $M_P$, the
next “mean” alignment will occur when $M_E - M_P = 2\pi$. If this corresponds to
a time interval $\tau$, then $1/T_E = 1/T_P + 1/\tau$. If we ignore the difference between
the orbital planes, what is really desired is that $\lambda_E$ and $\lambda_P$ coincide up to $2\pi$,
and the equation becomes

$$M_E + \Psi_E(M_E - \varpi_E) = M_P + \Psi_P(M_P - \varpi_P) + 2\pi.$$

This will require us to correct the $\tau$ value for each individual conjunction.
But the relative orbital position of the earth at conjunction determines the 
relative orbital position of the planet. This implies that the correction is a 
function of the time of year. 
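
The mean interval between alignments follows directly from the two orbital periods. The sketch below evaluates it for two outer planets. The periods in days are rounded modern values supplied here as assumptions (they are not given in the text), and the use of an absolute value, so that the same function also covers an inner planet, is likewise an assumption.

```python
def synodic_period(t_earth, t_planet):
    """Mean interval tau between alignments, from 1/T_E = 1/T_P + 1/tau.

    The absolute value lets the same formula serve when the planet's period
    is shorter than the earth's (an added convenience, not from the text).
    """
    return 1.0 / abs(1.0 / t_earth - 1.0 / t_planet)

T_EARTH = 365.256      # days, assumed rounded modern value
T_MARS = 686.980       # days, assumed rounded modern value
T_JUPITER = 4332.59    # days, assumed rounded modern value

tau_mars = synodic_period(T_EARTH, T_MARS)
tau_jupiter = synodic_period(T_EARTH, T_JUPITER)

print(round(tau_mars, 1), round(tau_jupiter, 1))
```

The slower the planet, the closer the interval comes to one year, since the earth needs only a little more than one revolution to catch up with it.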

In general, if only nontelescopic instruments are available, immediate 
angular measurements may not be very accurate, but if careful records are 
kept, long-time averages may be much more precise, and the phenomenon 
lends itself to empirical formulation. The modern form of such “empirical 
formulation” would undoubtedly be a Fourier series. The predictions for the
moon are probably the most difficult.


5.9. Babylonian Tables

The integration of this experience into a consistent arithmetical procedure
for prediction was an excellent intellectual accomplishment. Presumably
this occurred in Babylonia before 300 B.C. (see Neugebauer (20)). The
celestial sphere was introduced, and a zone, the “zodiac,” around the ecliptic
was divided into twelve equal sectors, corresponding to the constellations or
“signs of the zodiac”: Aries, Taurus, Gemini, etc. Thus, Aries corresponds to
longitude 0°-30°, Taurus to 30°-60°, etc. Babylonian arithmetic was quite
adequate to yield prediction tables for years ahead to correspond to observable
phenomena. Long-time averages were represented quite precisely, and
angular phenomena were represented in a manner quite adequate for
observation.

Babylonian astronomy is described by Neugebauer (20) (Chapter V,
p. 97). The characteristics described above are well illustrated by his examples 
(p. 110) of the changes in longitude of the sun in monthly intervals. The first 
column is a succession of dates corresponding to equal time intervals of a 
mean synodic month. The second column is the change in the longitude of the 
sun since the previous date, and the third column is the actual longitude of the
sun expressed by degrees within a constellation. Let $(\Delta\lambda)_i$, $i = 0, 1, \ldots, 12$,
denote the values in the second column. Thus, this table gives the position 
of the sun relative to a lunar calendar. 

The $(\Delta\lambda)_i$ values vary from month to month but are reasonably close to
an average value $\mu$. The $(\Delta\lambda)_i$ values vary between values $M$ and $m$ along a
“sawtooth” curve, so that except where the sawtooth changes slope, successive
tabulated values of $(\Delta\lambda)_i$ are obtained by adding or subtracting
$s = 18'$. The period of the sawtooth is one year. The entries, $\Delta\lambda$, are given in
degrees, minutes, seconds, and sixtieths of a second.
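
Entries written in degrees, minutes, seconds, and sixtieths of a second are place-value numbers in base 60, and converting them to and from decimal degrees is mechanical. The sketch below uses exact fractions to avoid rounding. The function names are hypothetical, and the sample entry 29;6,19,20 is an illustrative input chosen because it evaluates to about 29.10537°; it is not quoted from the table.

```python
from fractions import Fraction

def sexagesimal_to_degrees(places):
    """Convert [degrees, minutes, seconds, sixtieths, ...] to an exact fraction."""
    total = Fraction(0)
    for power, digit in enumerate(places):
        total += Fraction(digit, 60 ** power)
    return total

def degrees_to_sexagesimal(value, fractional_places=3):
    """Convert a nonnegative number of degrees to base-60 places (truncating)."""
    value = Fraction(value)
    whole = int(value)
    digits = [whole]
    remainder = value - whole
    for _ in range(fractional_places):
        remainder *= 60
        digit = int(remainder)
        digits.append(digit)
        remainder -= digit
    return digits

entry = [29, 6, 19, 20]   # illustrative: 29 degrees, 6 minutes, 19 seconds, 20 sixtieths
as_degrees = sexagesimal_to_degrees(entry)

print(float(as_degrees), degrees_to_sexagesimal(as_degrees))
print(degrees_to_sexagesimal(Fraction(3, 10), 1))   # 0.3 degrees is 18 minutes of arc
```

Exact rational arithmetic mirrors the way a terminating sexagesimal entry represents a rational number without any rounding.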

To construct this table, one must know or choose three quantities. One
of these is $v$, the number of synodic months in a year, the result of a long-time
averaging process. If $\mu$ is expressed in degrees, $\mu = \tfrac{1}{2}(M + m) = 360^\circ/v$. The
total variation of the sawtooth function over a year is $2(M - m)$, and if $s$ is the
slope of the sawtooth, $sv = 2(M - m)$. Thus, $v$ and the choice of the slope $s$
determine $M$ and $m$. The remaining quantity is a time-phase quantity that
determines where the first value of $\Delta\lambda$ in the table is to be taken on the
sawtooth. Variations in the choice of slope or phase will introduce relatively
small errors in individual values. But an error in the value of $v$ or $\mu$ would
produce an error that would increase with time.
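
The construction can be sketched in a few lines. The code below derives the extreme values from the mean monthly motion and the slope, and then generates a year of monthly increments along the sawtooth. The rule of reflecting an overshoot at the extreme values, the starting value of 28.5° (the phase), and all function names are assumptions made for illustration; they are not taken from the tablet or from the text.

```python
def sawtooth_bounds(mu_deg, step_deg):
    """Return (M, m) from the mean monthly motion mu and the monthly step s.

    Uses v = 360/mu months per year and s*v = 2*(M - m), with mu = (M + m)/2.
    """
    months_per_year = 360.0 / mu_deg
    half_range = step_deg * months_per_year / 4.0
    return mu_deg + half_range, mu_deg - half_range

def sawtooth_values(start, step, upper, lower, count, rising=True):
    """Generate `count` monthly increments, reflecting at the bounds (assumed rule)."""
    values = []
    value, direction = start, (1.0 if rising else -1.0)
    for _ in range(count):
        values.append(value)
        value += direction * step
        if value > upper:        # overshoot at the top: reflect and descend
            value, direction = 2.0 * upper - value, -1.0
        elif value < lower:      # overshoot at the bottom: reflect and ascend
            value, direction = 2.0 * lower - value, 1.0
    return values

MU = 29.10537          # mean monthly motion in degrees, as quoted in the text
STEP = 18.0 / 60.0     # 18 minutes of arc, in degrees
M_MAX, M_MIN = sawtooth_bounds(MU, STEP)

# The starting value (the phase) is a hypothetical choice.
increments = sawtooth_values(start=28.5, step=STEP, upper=M_MAX, lower=M_MIN, count=13)

# Running longitude of the sun, starting from an arbitrary 0 degrees.
longitudes = []
total = 0.0
for delta in increments:
    total = (total + delta) % 360.0
    longitudes.append(total)

print(round(M_MAX, 5), round(M_MIN, 5))
```

With these inputs the computed extremes agree with the tabulated maximum and minimum to within a few units in the fifth decimal place, which illustrates how the mean value and the slope together fix the whole scheme apart from its phase.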

It is interesting to compare the values used with modern values. The
value of $\mu$ given is 29.10537°. The modern value is 29.10675° and the difference
0.00138° is approximately 5" of arc. Similarly the maximum value $M$
is 30.03306° and the corresponding modern value is 30.08965° with a difference
corresponding to 3.4'. Similarly the $m$ used is 28.17769°, the calculated
minimum is 28.16274°, with a difference of 0.9' of arc. For the modern case,
a slope of 18.7' would have been chosen. These values indicate maximum
errors of a few minutes. If we suppose that the sawtooth should have been
replaced by a sine curve, the maximum difference between sawtooth and
sine would be 0.39° or 23.4'. This would seem to indicate that while long-range
averages may have percentage accuracies of 0.005%, the angular measurements
were probably not sensitive to much less than half a degree. The
Babylonians also had tables for the motion of the moon and for planetary 
phenomena. 

For further discussion of the material in this section, see Neugebauer. (20) 


5.10. Geometric Formulations 

Greek mathematicians including Apollonius and Hipparchus set up
geometric models for solar system motions using eccentrics and epicycles
that gave a better qualitative description. The Ptolemaic system is described
in Neugebauer (20) (Appendix 1, p. 191), and there are many references in
Van der Waerden. (29) If the orbital motions of the planets were circular and
consequently uniform, then the epicyclical description of the angular motions
referred to a fixed earth would be precisely correct. Under this assumption
the sun would appear to circle the earth uniformly and the inner planets would
circle around the sun. For the outer planets this is an apparent motion that is
indistinguishable from the actual motion. For example (see Figure 5.2), if one
represents the displacement of Mars relative to the sun by a vector $\mathbf{y}(t)$ and
that of the earth relative to the sun by a vector $\mathbf{x}(t)$, then the angular motion of
vector $ME$ is given by $\mathbf{y} - \mathbf{x}$. This is identical with a motion obtained as
follows. Let $C$ complete the parallelogram with vertices $ESM$ (Figure 5.2).
Then $\overrightarrow{EC} = \mathbf{y}$ and $\overrightarrow{MC} = \mathbf{x}$, and $C$ revolves around the earth the same way as
$M$ does around $S$, and $M$ revolves around $C$ the way $E$ revolves around $S$.
Thus, $ME$ can be considered as due to the rotation of $M$ given by $\mathbf{x}$ about
moving $C$ given by $\mathbf{y}$. Thus, if the orbital motions were uniform, an epicyclical
description would be valid.

But the orbital motions are not quite uniform even though the eccentricities 
of the major planets are not large numbers. One can therefore try to 
“compound” the epicyclical models by describing the nonuniform orbital 
motions by epicycloids. One can begin with the apparent motion of the sun in 
longitude, i.e., the equivalent of the earth’s orbital motion, which appears in 
the epicyclical description of every planet. 

Let us obtain a first-order fit to a Kepler orbital motion by means of an 
epicycle. Consider Figure 5.3 and suppose P moves around S in an epicyclical 
fashion, with CS (the “deferent”) moving with radius R and with PC (the 
eccentric) with radius $r = \rho R$, say, with $\rho < 1$. CS has rotated an amount $\tau$ and 
PC an amount $\sigma$ relative to SC, and $\tau$ and $\sigma$ are both linear functions of the 
time. For the earth orbit, we will have $\tau - \sigma$ equal to a constant which we can 
take as zero. Then $\theta$, the heliocentric longitude, equals $\tau + \alpha$, where $\alpha = 
\arctan[r \sin\sigma/(R - r\cos\sigma)] = \arctan[\rho \sin\sigma/(1 - \rho\cos\sigma)]$. Since $\rho < 1$, $\alpha$ 
can be expressed as a Fourier series in $\sigma$, i.e., 

$$\theta = \tau + \sum_{n=1}^{\infty} \frac{\rho^n}{n} \sin n\sigma.$$

Figure 5.2. Apparent epicyclic motion. 

Figure 5.3. Epicyclic angles. 


We can, of course, consider this expression as equivalent to 

$$\theta = M + \Psi(M - \varpi) \quad\text{or}\quad \theta = M' + \varpi + \Psi(M').$$

We have seen (p. 114) that in general we can take $\Psi$ in the form 

$$\Psi(M') = a_1 \sin M' + a_2 \sin 2M' + \cdots + a_k \sin kM',$$

where the $a_i$ values get progressively smaller. For small eccentricities $a_1$ is 
about $2e$, and we can take $\rho = a_1 = 2e$. Then the discrepancies between the 
Keplerian orbit corresponding to $\Psi$ and the above epicyclical model with 
$\rho = a_1$ will appear only in the higher frequencies. (Notice that we are comparing 
only angular values. The variation in the distance PS for the epicyclical 
model used here is twice the variation of PS in the Keplerian orbit, and this 
can lead to observable differences in the apparent diameter of the sun and 
planets and in the relative brightness.) 

For the apparent motion of the sun relative to the earth, $a_1 = 0.033451$, $\rho$ 
is 0.033451, and thus the epicyclical expression is 

$$\theta = \tau + 0.033451 \sin\sigma + 0.000560 \sin 2\sigma + 0.000012 \sin 3\sigma.$$

The Keplerian second term has value 0.000349 [see Equation (7) above] and 
the difference of 0.000212 radians corresponds to 44". In one day M changes by 
0.017203, so that the difference between the two models is about 1/80 of the 
change in longitude in a day. 
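
The coefficients quoted above can be checked numerically. The following Python sketch is an illustration added under stated assumptions and is not part of the original derivation: the names `epicycle_offset` and `epicycle_series` and the 50-term truncation are hypothetical choices, and the value 0.000349 for the Keplerian second term is simply taken from the text.

```python
import math

RHO = 0.033451             # epicycle ratio rho = a_1 = 2e, as quoted in the text
KEPLER_SECOND = 0.000349   # Keplerian second coefficient, as quoted in the text
DAILY_MOTION = 0.017203    # change of M in one day (radians), as quoted in the text


def epicycle_offset(rho, sigma):
    """Closed form: alpha = arctan[rho sin(sigma) / (1 - rho cos(sigma))]."""
    return math.atan2(rho * math.sin(sigma), 1.0 - rho * math.cos(sigma))


def epicycle_series(rho, sigma, terms=50):
    """Truncated series: sum over n of (rho**n / n) * sin(n * sigma)."""
    return sum(rho**n / n * math.sin(n * sigma) for n in range(1, terms + 1))


second = RHO**2 / 2        # coefficient of sin(2 sigma)
third = RHO**3 / 3         # coefficient of sin(3 sigma)
diff_rad = second - KEPLER_SECOND
diff_arcsec = math.degrees(diff_rad) * 3600.0

print(f"second coefficient: {second:.6f}")
print(f"third coefficient:  {third:.6f}")
print(f"difference: {diff_rad:.6f} rad = {diff_arcsec:.1f} arcsec")
print(f"fraction of daily motion: {diff_rad / DAILY_MOTION:.4f}")
```

Computed this way, the second coefficient agrees with the quoted 0.000560 to within rounding, and the difference from the Keplerian term comes out between 43 and 44 seconds of arc, a little over one percent of the daily motion.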

This discrepancy, of course, depends on the eccentricity. If we look at the 
table on p. 422 of Smart, (24) we see that of classically known planets Mercury 
has e = 0.205627, but Mercury would be hard to observe. For the rest Mars 
has eccentricity 0.093368, which is between five and six times as large as that of the 
earth, and of course, the elliptical character of the orbit of Mars was the first 
to be established by Kepler. 

The “compounding” of epicyclical motions is, of course, very complex, 
and Copernicus recognized that using the sun as the center simplified the 
description, since otherwise its motion appeared as an element in the motion 
of every planet. Actually what Copernicus pointed out was that there was a far 
more reasonable description of the cosmos than the fixed-earth and earth- 
centered classical model and in this more reasonable model the larger sun 
was the center around which the planets moved, including a daily rotating 
earth. One aspect of the new model was that the previous epicyclical information 
now indicated approximately the ratio of the orbital major axis of a 
planet to that of the earth. Thus, the Copernican model was based on two 
scales. One scale applied to the earth-moon system and could be considered 
to refer to a reasonably correct estimate of the radius of the earth. The larger 
scale, of course, involved the orbit of a planet and was relatively correct in 
terms of the axis of earth’s orbit, but the ratio of the two scales was considerably 
underestimated. Thus, Copernicus considered the ratio of the semimajor 
axis of the earth’s orbit to the earth’s radius to be 1142, while it is actually about 
23,500. This situation can be described as a reasonable interpretation of the 
angular information of limited accuracy. 

For further discussion of the material in this section, see Neugebauer (20) 
and Van der Waerden. (29) 


5.11. Astronomical Experience in Terms of Accuracy 

The development of astronomy is a clear-cut example of interacting 
intellectual and experimental procedures. In the initial stages the above 
arithmetical and geometric procedures were adequate to correspond to 
relatively accurate long-time averages and presumably rather crude angular 
measurements. Direct observations of angles were based on instruments 
called “quadrants” or devices that set up isosceles or right triangles so that 
the angles could be determined by trigonometric tables. Hand-held instruments 
of this type were probably not more accurate than a tenth of a degree. 
For angles in a fixed plane, usually the meridian, it was probably possible to 
set up larger instruments with finer angular resolution, but there may not 
have been a corresponding increase in accuracy. To obtain a correct order of 
magnitude estimate of the ratio of the two scales of the Copernican model by 
direct geometric methods, accuracies of a fraction of a second between 
cooperating observers at a considerable distance apart would be required. 

The accuracy of angular measurements can be inferred to a certain 
extent from examples quoted in Dreyer. In measuring the radius of the earth 
Eratosthenes concluded that the difference in latitude between Alexandria 
and Syene corresponds to 1/50 of a circular circumference. Of course, no 
precision is indicated by this statement, but the corresponding degree 
expression is 7°12'. The correct value is 7°6.7', and thus the error is about a 
tenth of a degree. Ptolemy gave the latitude of Alexandria as 30°58', while 
the true latitude is 31°11.7', a difference of two-tenths of a degree. Similarly, 
Eratosthenes’ measurement of the latitude of Syene was 23°51.3' and the 
true latitude was 23°43.3', a difference of 8'. Tycho Brahe corrects an error of 
3' in the work of Copernicus relative to the latitude of Ermland. These are 
all measurements associated with angles along a meridian and furthermore 
can be taken near the zenith, where refraction is negligible. Thus, the limit of 
accuracy was about a tenth of a degree and the associated model of Copernicus 
is completely appropriate for this experience. The model of Copernicus 
was not generally accepted for a century after it was proposed, but it was 
accepted by Kepler, and this proved to be decisive. 

The Dark Ages saw a rather complete intellectual deterioration in 
Western Europe, but the art of instrumentation continued along classical 
lines in Byzantium and among Arabic astronomers. But from A.D. 1300 on, 
there was a revival of interest in astronomy in Europe, which, however, was 
plagued by theoretical inconsistencies and lack of effective observation (see 
Dreyer (5)). The work of Copernicus is essentially intellectual and his observations 
are supplements to his development of the model. The Danish astronomer 
Tycho Brahe realized the importance of systematic observations of the 
best possible accuracy and set up an observatory with instruments capable of 
consistent accuracy of the order of a quarter of a minute. These preoptical 
instruments and the consistent system of observations using them permitted 
Kepler to correct the epicyclic models into the geometrically sound elliptical 
orbits and to infer his famous “laws” with their dynamic content that was 
recognized by Newton. Kepler’s analysis uses geometric arguments. 

The instruments of Tycho Brahe and his own account of these and his 
astronomical experience are fascinating (Raeder et al. (23)). He describes the 
various instruments and the procedures for using them and makes very 
definite statements about the precision and accuracy of the measurements. 

There is one group of instruments for measuring azimuth and elevation 
or zenith distance. Certain of them consisted of a “quadrant” rotating around 
a vertical axis. The zenith distance of an object was measured by sighting along 
the arm of the quadrant. Azimuth measurements were obtained by presetting 
the azimuth reading and taking the elevation reading at the time the preset 
azimuth was obtained and noting the time. The quadrants used had radii of 
from 155 cm to 194 cm. This corresponds to lengths of from 2.7 to 3.386 cm 
for a degree and one minute corresponds to about half a millimeter. There 
was a special zigzag scale to permit a fine resolution. Accuracies of 3 , or £ of 
a minute were claimed. This must be very close to the limit of visual discernment. 
See Miczaika and Sinton, (16) pp. 54-55, where it is stated that the eye 
can separate points as close as one minute of arc. 
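
The scale lengths quoted for the quadrants follow from the arc-length relation s = rθ with θ in radians. As a minimal sketch (the function name `arc_length_cm` is an illustrative assumption, not something from the text), the figures can be reproduced as follows:

```python
import math


def arc_length_cm(radius_cm, angle_deg):
    """Length of arc (cm) subtended by angle_deg on a circle of the given radius."""
    return radius_cm * math.radians(angle_deg)


# Quadrant radii quoted in the text.
for radius in (155.0, 194.0):
    per_degree_cm = arc_length_cm(radius, 1.0)
    per_minute_mm = 10.0 * arc_length_cm(radius, 1.0 / 60.0)
    print(f"radius {radius:.0f} cm: {per_degree_cm:.3f} cm per degree, "
          f"{per_minute_mm:.2f} mm per minute of arc")
```

For both radii one minute of arc occupies roughly half a millimeter on the scale, which is why a reading to a fraction of a minute needs a special subdivision of the scale.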

Another type of instrument for measuring azimuth and elevation had, in 
place of the quadrant, a combination of rulers that formed an isosceles 
triangle in which one of the angles corresponded to the zenith distance or the 
elevation. Copernicus used instruments of this type. 

Large “equatorial” instruments were also available in which a circle 
used as a quadrant was pivoted about an axis parallel to the polar axis. The 
quadrant reading yielded declination and the amount of rotation yielded 
right ascension. The latter reading was obtained by presetting the right 
ascension value, and in order to obtain a reading accurate to within a quarter 
of an angular minute, the reading had to be taken to within one second in time. 
This seems to have been the accuracy objective, that is, angles within a 
quarter of a minute and time to within a second. Four clocks were used in 
order to obtain the required consistency, and measurements were made 
independently on three different instruments. 

There was also a “zodiacal” instrument in which an interior ring carried 
a representation of the ecliptic, but this instrument presented difficulties 
because of lack of balance. Because of this, measurements were made on the 
“equatorial” or fixed system and were converted to longitude and latitude by 
trigonometry. There was another type of instrument that was used to measure 
angular differences on the celestial sphere. 

For further discussion of the material in this section, see Dreyer, (5) 
Raeder et al., (23) and Miczaika and Sinton. (16) 


5.12. Optical Instruments and Developments 

Thus, before the introduction of the telescope, increasingly sophisticated 
observations led to the Keplerian orbital formulation and Newton’s dynamic 
description of the solar system. The telescope removed all doubt about the 
heliocentric character of the solar system, but it also provided greatly increased 
accuracy to complement the far more precise Newtonian description. 
Photography provided great versatility for optical instruments and there has 
also been a tremendous improvement in time measurements. 

Telescopes are usually specialized for specific purposes, and angular 
resolution is not always a prime objective. But for a photographic instrument 
intended for angular measurements, the relevant aspect is the plate scale, that 
is, the angular variation that corresponds to a millimeter on the photographic 
plate. Photography will probably permit a further resolution of 
about one in a hundred, but other instrumental limitations may negate any 
such resolution. Miczaika and Sinton (16) state that specialized telescopes have 
plate scales of 1" to 2". Construction problems limit refracting telescopes to a 
resolution of 4.65/D seconds of arc, where D is the diameter of the objective 
lens in inches. This would correspond to a resolution of a tenth of a second 
for even reasonably large instruments. For stellar parallaxes Smart (24) (p. 412) 
indicates a somewhat more refined resolution. 

The various ratios in the orbital scale of the solar system are rather 
precisely determined by Kepler’s third law and observations of the orbital 
periods. This permits one to express various distances in terms of the “astronomical 
unit,” the semimajor axis of the earth’s orbit. Thus, if one orbital scale 
distance can be expressed in terms of the earth's radius, the description of the 
solar system can be completed. A transit of Venus across the disk of the sun 
did permit a cooperative parallax measurement of its distance from earth, 
and the planetoid Eros was also utilized for parallax measurements. Another 
procedure for determining the size of the solar system involves a precise 
prediction of satellite motion for a major planet. At different orbital positions 
this phenomenon has different apparent delays due to the time it takes light 
to travel the different distances involved. Since the speed of light is known, the 
distances can be determined. Modern radar has permitted relatively direct 
and accurate measurements of the distance to the Moon and Venus. Space 
probes also reveal the fine structure of the dynamic description of the solar 
system. 

Our understanding of the solar system involves a conceptual image 
including notions of space, i.e., geometry, time and objects subject to certain 
physical laws. These concepts are scientifically significant because they lead 
to intellectual mathematical experience that matches actual experience in a 
very significant manner. It is important to appreciate that both the intellectual 
image and the actual experience grew by interacting with each other. At 
every stage the conceptual image provided the format for the experience, and 
the latter, in turn, reacted on the former. The process is essentially inseparable. 

The astronomy of the solar system is an excellent example of how the 
horizons of human experience have been widened in conjunction with 
intellectual developments. The modern equivalent is astrophysics with its 
wide range of observational techniques, including spectroscopy, cosmic ray 
detection, and visual, radio, and even x-ray telescopy. The corresponding 
conceptual equivalent involves the modern understanding of atomic and 
nuclear phenomena and thermodynamic and cosmologic notions. These 
are expressed in a great range of mathematical concepts. The prediction 
arithmetic and numerical modeling require the capacity of modern electronic 
computers. But all this grew in stages that can be precisely determined in 
history from the sawtooth arithmetic procedure of the Babylonians. 

There is, however, one further contrast that must be drawn between the 
Babylonian arithmetic description of phenomena and certain competing 
notions. When a merchant at the beginning of a voyage offered sacrifices to 
Zeus or Poseidon, he acted on the assumption that these great spirits controlled 
the air and the sea and could be influenced to act in the merchant’s 
favor. Thus, the behavior of the sea and air is explained “animistically.” On 
the other hand, if the heavenly bodies move according to a mathematical law 
that predicts their motion, then clearly no spirit intervenes. Thus, there is a 
fundamental antithesis between an “animistic” concept of nature and a 
“mathematical description” of phenomena. A scientific approach always 
assumes that one is dealing with kinds of experience that have precise logical 
delimitations and for which relations can be expressed unambiguously in 
symbolic form. Animistic intervention is inconsistent with such relations. 

This antithesis is still completely valid even when the mathematical 
description involves the notion of probability. The biological theory of evolution 
involves probability both in the mechanism of inheritance and in preferential 
survival but specifically denies animistic intervention in the evolution 
of species. Similarly, games of chance are played on the assumption that the 
outcome is subject to the mathematically described laws of probability. Any 
animistic intervention is considered cheating. 

There has been some confusion in this last regard because of the contrast 
between the deterministic and probabilistic aspects of physics. Actually these 
notions are really complementary rather than antithetical. They both 
correspond to mathematical descriptions of phenomena and both are antithetical 
to any animistic description. 

For further discussion of the material in this section, see Miczaika and 
Sinton, (16) Smart, (24) Thackeray, (28) and Whitehead. (31) 

For further discussion of the material in this chapter, see Chace, (3) 
Dijksterhuis, (6) Heath, (10) Moon, (17) Neugebauer, (20) Neugebauer and 
Satz, (21) and Struik. (26) 


Exercises 


5.1. Find the formula for the Tartaglia solution of the cubic. Each time a root is 
taken, a number of possibilities appear so that this procedure apparently yields 18 forms 
for the solution. Are all 18 forms solutions? Can three distinct solutions be obtained? 
What are the various fields involved in solving this equation and what are the splitting 
fields? Can we have complex numbers in the algorithmic procedure even when all the 
roots of the original equation are real? 

5.2. Do the equivalent of the previous exercise for the quartic. One way of obtaining 
an auxiliary cubic equation for a quartic with roots $a_1, a_2, a_3, a_4$ is to set up the equation 
with root $a_1 a_2 + a_3 a_4$. What are the other roots one should use? How do you set up the 
equation? Suppose you have solved this cubic. How do you solve the quartic? What is the 
relation of y to $a_1 a_2 + a_3 a_4$? 

5.3. In the transformations of the one-parameter Lorentz group, c is a constant 
and the ratio $\beta = v/c$ is considered to be the parameter. Let $T(\beta)$ denote the 
transformation $T(\beta)\colon (x, t) \to (x', t')$, $x' = (x - \beta ct)(1 - \beta^2)^{-1/2}$, 
$ct' = (ct - \beta x)(1 - \beta^2)^{-1/2}$. Show that 
$T(\beta_3) = T(\beta_2) T(\beta_1)$ for $\beta_3 = (\beta_1 + \beta_2)/(1 + \beta_1 \beta_2)$. 

5.4. A finite subgroup of the group of rotations can be associated with a set of unit 
vectors each of which is the axis of an element of the subgroup. This set of unit vectors is 
permuted by the elements of the finite subgroup. This permits one to determine all such 
finite subgroups and in particular those subgroups that leave an infinite lattice invariant. 
(See Jansen and Boon, (13) page 334.) 

5.5. Describe the apparent angular motion of a point on an epicycle to an observer 
at the center of the fixed circle by a kinematic representation of moving vectors. Compare 
this procedure with those used by Apollonius as described in Van der Waerden, (29) 
pp. 238-240. 

5.6. Consider two spherical coordinate systems, $(\alpha, \beta)$ and $(\lambda, \mu)$, on a sphere, with 
$\alpha$ and $\lambda$ being the longitude variable in each case. One can shift the longitude origins, 
replacing $\alpha$ by $\alpha + \alpha_0$ and $\lambda$ by $\lambda' = \lambda + \lambda_0$, so that for the coordinate pairs 
$(\alpha, \beta)$ and $(\lambda', \mu)$, $\alpha$ and $\lambda'$ are measured from a 
common intersection point $Q_0$ of the equatorial circles. Let $\varepsilon$ be the acute angle between 
the equatorial circles. We can also replace $\lambda'$ and $\mu$ by new variables $\bar\lambda$ and $\bar\mu$ with 
$\bar\lambda = \pm\lambda'$ and $\bar\mu = \pm\mu$, choosing the appropriate sign so that $\bar\lambda$ and $\alpha$ have the same sign 
on adjacent sides of $Q_0$ and the distance between the positive $\beta$ and $\bar\mu$ poles is $\varepsilon$ (Figure 5.4). 

Show that 

$$\tan\bar\lambda = \cos\varepsilon \tan\alpha - \sin\varepsilon \sec\alpha \tan\beta,$$

$$\sin\bar\mu = \sin\varepsilon \cos\beta \sin\alpha + \cos\varepsilon \sin\beta.$$

[Hint: Take a rectangular coordinate system x, y, z corresponding to the $(\alpha, \beta)$ coordinates 
for which $x = \cos\alpha \cos\beta$, $y = \sin\alpha \cos\beta$, $z = \sin\beta$. Rotating an amount $\varepsilon$ about the 
x axis will yield the corresponding Cartesian system for $(\bar\lambda, \bar\mu)$.] 

Apply this to the equatorial and zodiacal coordinates on the celestial sphere. 
Express the relation between azimuth and elevation and the equatorial coordinates of a 
star using sidereal time. 
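
The two relations of Exercise 5.6 can be checked numerically before attempting a proof. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the sense of the rotation about the x axis is chosen so that the signs agree with the formulas as stated in the exercise.

```python
import math


def rotate_about_x(alpha, beta, eps):
    """Longitude and latitude of the point (alpha, beta) in the rotated system."""
    x = math.cos(alpha) * math.cos(beta)
    y = math.sin(alpha) * math.cos(beta)
    z = math.sin(beta)
    # Rotation about the x axis; sense assumed so as to match the stated signs.
    y_rot = y * math.cos(eps) - z * math.sin(eps)
    z_rot = y * math.sin(eps) + z * math.cos(eps)
    return math.atan2(y_rot, x), math.asin(z_rot)


def stated_formulas(alpha, beta, eps):
    """Right-hand sides of the two relations: (tan of longitude, sin of latitude)."""
    tan_lam = (math.cos(eps) * math.tan(alpha)
               - math.sin(eps) * math.tan(beta) / math.cos(alpha))
    sin_mu = (math.sin(eps) * math.cos(beta) * math.sin(alpha)
              + math.cos(eps) * math.sin(beta))
    return tan_lam, sin_mu


alpha, beta, eps = 0.7, 0.3, math.radians(23.45)
lam, mu = rotate_about_x(alpha, beta, eps)
tan_lam, sin_mu = stated_formulas(alpha, beta, eps)
print(math.tan(lam), tan_lam)
print(math.sin(mu), sin_mu)
```

Each printed pair should agree to rounding error; the same function can then be reused with the obliquity of the ecliptic as the angle between the two equatorial circles.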

5.7. Given 

$$\frac{dv/dt}{(1 + e\cos v)^2} = \frac{n}{(1 - e^2)^{3/2}},$$

let $e^* = e/[1 + (1 - e^2)^{1/2}]$ and $t = T$ for $v = 0$. Show that 

$$n(t - T) = v + \sum_{p=1}^{\infty} 2(-1)^p e^{*p} \left[(1 - e^2)^{1/2} + 1/p\right] \sin pv.$$

[Hint: Introduce the variable $V = \exp(iv)$. By partial fractions 

$$\frac{dv/dt}{(1 + e\cos v)^2} = \frac{-iV\,(dV/dt)}{[V + (e/2)(V^2 + 1)]^2} = \frac{-i}{1 - e^2}\left[-\frac{e^*}{(V + e^*)^2} - \frac{e^{*-1}}{(V + e^{*-1})^2} + \frac{1}{(1 - e^2)^{1/2}}\left(\frac{1}{V + e^*} - \frac{1}{V + e^{*-1}}\right)\right]\frac{dV}{dt}.$$

Integrating and evaluating the constant of integration yields 

$$(1 - e^2)^{1/2}\left(\frac{e^* V^{-1}}{1 + e^* V^{-1}} - \frac{e^* V}{1 + e^* V}\right) + \log(1 + e^* V^{-1}) - \log(1 + e^* V) + \log V = in(t - T).$$

Expanding by power series yields 

$$\sum_{p=1}^{\infty} (-1)^p e^{*p}\left[(1 - e^2)^{1/2} + 1/p\right](V^p - V^{-p}) + iv = in(t - T).]$$


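5.8. The right ascension $\alpha$ of the sun is given in terms of $\lambda$ by the formula 
$\tan\alpha = \cos\varepsilon \tan\lambda$, where $\varepsilon \approx 23^\circ 27'$. Let 

$$H = \frac{1 - \cos\varepsilon}{1 + \cos\varepsilon}.$$

Show by means of the sinusoidal, $\Lambda = \exp(i\lambda)$, that 

$$\alpha = \lambda + \sum_{n=1}^{\infty} \left[(-1)^n H^n/n\right] \sin 2n\lambda.$$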

5.9. A relationship such as $v + \Phi(v) = x$ with $\Phi(v) = \sum_{n=1}^{\infty} a_n \sin nv$ such that 
$\sum_{n=1}^{\infty} n|a_n| = A < 1$ has the property that given x, the corresponding value of v can be obtained by 
repeated substitution. Show that if $v_0 = x$, $v_n = x - \Phi(v_{n-1})$, then $v_n \to v$ such that 
$v + \Phi(v) = x$. Program this. 
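
One possible way to program the repeated substitution of Exercise 5.9 is sketched below. It is a minimal illustration rather than a prescribed solution: the function names, the tolerance, the iteration cap, and the sample coefficients are all assumptions made for the example, and a finite list of coefficients stands in for the series.

```python
import math


def phi(v, coeffs):
    """Phi(v) = sum of a_n sin(n v), with coeffs = [a_1, a_2, ...]."""
    return sum(a * math.sin(n * v) for n, a in enumerate(coeffs, start=1))


def solve_by_substitution(x, coeffs, tol=1e-14, max_iter=500):
    """Solve v + Phi(v) = x by the iteration v_0 = x, v_n = x - Phi(v_{n-1})."""
    contraction = sum(n * abs(a) for n, a in enumerate(coeffs, start=1))
    if contraction >= 1.0:
        raise ValueError("sum of n*|a_n| must be less than 1")
    v = x
    for _ in range(max_iter):
        v_next = x - phi(v, coeffs)
        if abs(v_next - v) < tol:
            return v_next
        v = v_next
    raise RuntimeError("iteration did not converge")


coeffs = [0.3, 0.1, 0.05]          # sample values: sum of n*|a_n| = 0.65 < 1
v = solve_by_substitution(1.0, coeffs)
print(v, v + phi(v, coeffs))       # second value should reproduce x = 1.0
```

The condition on the coefficients bounds the derivative of Φ by A, so each substitution shrinks the error by at least the factor A; this is the contraction argument the exercise asks for.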

5.10. Let $X(\alpha)$ be the three-dimensional vector $(\sin\alpha, \sin 2\alpha, \sin 3\alpha)$. Show that 
$X_1 = X(\pi/4)$, $X_2 = X(\pi/2)$, $X_3 = X(3\pi/4)$ are mutually orthogonal. If we assume that a 
function $\Psi(x)$ is in the form $a_1 \sin x + a_2 \sin 2x + a_3 \sin 3x$, it follows that if we know the 
values of $\Psi$ for $x = \pi/4$, $\pi/2$, and $3\pi/4$, we can find the coefficients $a_1, a_2, a_3$. What are the 
formulas? If we use this to invert the relation $v + \Phi(v) = x$ into $v = x + \Psi(x)$, how would 
one investigate the error? 

5.11. Generalize the relations of the previous exercises to the case in which 
$X(\alpha) = (\sin\alpha, \sin 2\alpha, \ldots, \sin(2n - 1)\alpha)$ and $x_p = p\pi/2n$ for $p = 1, \ldots, 2n - 1$ and obtain the 
inversion formulas. We can also generalize this to finite Fourier series. Show that if 

$$X(\alpha) = (1, \sin\alpha, \cos\alpha, \sin 2\alpha, \cos 2\alpha, \ldots, \sin k\alpha, \cos k\alpha),$$

then for $\alpha \neq \beta$ the inner product is 

$$X(\alpha) \cdot X(\beta) = \tfrac12\{1 - \cos[(k + 1)(\alpha - \beta)] + \sin[(k + 1)(\alpha - \beta)] \cot[\tfrac12(\alpha - \beta)]\}$$

and $X(\alpha) \cdot X(\alpha) = k + 1$. How does this permit one to obtain $2k + 1$ mutually orthogonal 
vectors? How can this be used to invert a Fourier series? 

5.12. Show that if $\tan\alpha = \rho \sin\sigma/(1 - \rho\cos\sigma)$, then 

$$\alpha = \sum_{n=1}^{\infty} \frac{\rho^n}{n} \sin n\sigma.$$

5.13. On the basis of the table in Smart (24) (p. 422) discuss the synodical conjunctions 
of the earth with Mars. How large are the corrections from the mean? For Venus 
find the points of maximum elongation from the sun on the westward or eastward side. 

5.14. Given the first line of the table on p. 110 in Neugebauer, (20) construct the 
rest of the table using modern values. Show that if we have for $M' = M - \varpi$, $|M'| < 2\pi$, 

$$\lambda = M + \psi(M - \varpi) \pm \pi = M' + \varpi + \psi(M') \pm \pi$$

for 

$$\psi(x) = a_1 \sin x + a_2 \sin 2x + a_3 \sin 3x,$$

and $\delta$ is the change in M for a synodic month, then the longitudinal change $\Delta\lambda$ in a 
synodic month centered at $M'$ is 

$$\Delta\lambda = \delta + 2a_1 \sin(\delta/2) \cos M' + 2a_2 \sin\delta \cos 2M' + 2a_3 \sin(3\delta/2) \cos 3M'.$$

What are the maximum and minimum values of $\Delta\lambda$? 

5.15. Discuss the epicycloidal approximation to the orbit of Mars. How would the 
brightness of the planet be affected by this approximation? 


Natural Philosophy 


6.1. Analysis 

During the Renaissance there was a tremendous European interest in 
mathematics and it produced a more sophisticated and effective algebra. 
This algebra was combined with various geometric procedures and other 
concepts to yield the methods of analysis. In classical mathematics quantitative 
methods were applicable only to “numbers,” i.e., natural or mixed, and a 
limited range of geometric magnitudes. The new analysis represented an 
extension of quantitative procedures to a much larger domain of experience 
including kinetics, dynamics, the properties of matter, and to a far more 
general “analytic” geometry. This was the critical intellectual achievement 
that produced the modern exact sciences. 

The detailed history of this development is available in such references 
as Klein, (13) Boyer, (4) Smith, (19) and Robinson. (18) Our immediate interest 
is in the rich variety of intellectual elements that were part of this development 
and the reasons they were ultimately transformed into the modern form of 
mathematics. There were at least two distinct conceptual forms for the 
calculus, and both were different from our present-day mathematics. It is 
interesting to observe that a good deal of the terminology and symbolism 
of these early forms of the calculus has been retained, but in order to fit such a 
term or symbol into our present formulation, it is redefined in a way that 
seems completely different from the obvious interpretation of the term or 
symbol. 

The European development of algebra had two complementary aspects. 
One of these was a progressive change in attitude accompanied by an evolution 
in notation so that, finally, the manipulation of equations was considered 
to have logical significance equivalent to the arguments of geometry. Thus, 
one could set down an equation and by justified manipulations obtain 
necessary characteristics of an unknown, or similarly, one could start with an 
expression for a number and establish desired properties. The other aspect 
of this algebra was an increase in capability to include the solution of the 
cubic and quartic equations and various formulas involving a general n that 
could be established inductively. For example, the sums of powers of integers 
are given by 

$$1 + 2 + \cdots + n = n(n + 1)/2$$

$$1^2 + 2^2 + \cdots + n^2 = n(n + 1)(2n + 1)/6.$$


These formulas readily lend themselves to obtaining areas by classical 
“methods of exhaustion.” Thus, the area A under the curve $y = x^2$ between 
$x = 0$ and $x = 1$ can be boxed in between two sets of rectangles with areas 

$$A' = \sum_{j=1}^{n} \frac{(j - 1)^2}{n^2} \cdot \frac{1}{n} \quad\text{and}\quad A'' = \sum_{j=1}^{n} \frac{j^2}{n^2} \cdot \frac{1}{n},$$

respectively. The above formula yields that $A' = \tfrac13 - 1/2n + 1/6n^2$ and $A'' = \tfrac13 
+ 1/2n + 1/6n^2$. Since we can take n arbitrarily large, A must be $\tfrac13$. For integral 
n, the area under $y = x^n$ between $x = 0$ and $x = a$ can be shown to be $a^{n+1}/(n + 1)$. 
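
The boxing-in argument can be followed with exact rational arithmetic. The sketch below is an added illustration; the names `lower_sum` and `upper_sum` are hypothetical, and the choice of n values is arbitrary.

```python
from fractions import Fraction


def lower_sum(n):
    """Total area of the n inscribed rectangles under y = x**2 on [0, 1]."""
    return sum(Fraction((j - 1) ** 2, n ** 3) for j in range(1, n + 1))


def upper_sum(n):
    """Total area of the n circumscribed rectangles over y = x**2 on [0, 1]."""
    return sum(Fraction(j ** 2, n ** 3) for j in range(1, n + 1))


for n in (10, 100, 1000):
    low, high = lower_sum(n), upper_sum(n)
    # The gap between the two sums is exactly 1/n, so it can be made arbitrarily small.
    print(n, float(low), float(high), high - low)
```

Because the two sums always straddle 1/3 and differ by exactly 1/n, no value other than 1/3 can lie between them for every n, which is the content of the exhaustion argument.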

Another expansion of quantitative procedure was represented by the 
principle or axiom of Cavalieri. In the case of volumes this principle would 
state that two volumes are equal if they have equal bases and if the cross 
sections parallel to the base at equal distances from the base are equal. Thus, 
if the cross section of one solid is the sum of the cross sections of two other 
solids, the volume of the first is the sum of the volumes of the other two. As in 
many other developments of a similar nature at this time, this principle of 
Cavalieri had a wide range of practical valid application. This range certainly 
exceeded the classical solids, but there was no conceptual framework in which 
it could be formulated so that its exact domain of validity could be established. 
The “proof” given by Cavalieri appears to be a kind of limiting 
process that we would consider as valid only under a considerable number of 
additional assumptions. 

This Archimedean comparison of cross sections could be applied to both 
moments and linear motion. The moment of the area of the triangle under 
y = x and between x = 0 and x= 1 considered about the y axis is equal to the 
volume of a pyramid with base perpendicular to the x axis and with cross 
section of area $x^2$. This is readily generalized so that moments of areas between 
curves with positive ordinates and the x axis and between specified x 
limits and around the y axis can be equated to volumes. 

For linear motion it was known that if one takes time along an axis, say, 
the x axis, and at each abscissa erects an ordinate equal to the speed, then the 
distance covered in the time between $t_1$ and $t_2$ equals the corresponding area. 
It was also known that if one takes the ordinate equal to the distance covered 
from a fixed time $t_0$, then the slope of the tangent to this curve was equal to 
the speed. In a sense this was a form of the fundamental theorem of the 
calculus that was of great practical significance. 

For further discussion of the material in this section, see Boyer, (4) 
Klein, (13) Robinson, (18) and Smith. (19) 


6.2. The Calculus 

During the seventeenth century, procedures for finding tangents by 
algebraic methods appeared in many forms. The “symptom” relations were 
translated into algebraic forms that readily permitted these manipulations. 
For example, let us find the slope of the tangent to $x^2 + 2y^2 = 3$ at (1, 1). 
Consider a point $(x', y') = (1 + h, 1 + k)$ on the ellipse. Thus, $2h + h^2 + 4k 
+ 2k^2 = 0$. Dividing by h yields 

$$(2 + h) + \frac{k}{h}(4 + 2k) = 0.$$

Now if we consider the point $(x', y')$ as moving along the ellipse toward (1, 1), 
then the ratio k/h is also a changing quantity, i.e., a “variable” that approaches 
$-\tfrac12$ as $(x', y')$ moves to (1, 1). From the practical point of view all one really 
had to do was set h and k equal to zero in the above equation, ignoring the 
fact that they occur in the ratio that one considers a new quantity. 
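
The behavior of the ratio k/h can be watched numerically. In the sketch below, which is an added illustration with a hypothetical function name, k is obtained for each h by solving the ellipse equation on the branch through (1, 1).

```python
import math


def k_for_h(h):
    """Value of k with (1 + h, 1 + k) on x**2 + 2*y**2 = 3, on the branch near k = 0."""
    return math.sqrt((3.0 - (1.0 + h) ** 2) / 2.0) - 1.0


for h in (0.1, 0.01, 0.001, 1e-6):
    k = k_for_h(h)
    # The second column is k/h; the third is -(2 + h)/(4 + 2k) from the divided equation.
    print(h, k / h, -(2.0 + h) / (4.0 + 2.0 * k))
```

The two computed columns agree for every h and settle toward the slope obtained by setting h and k to zero in the divided equation.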

Clearly, with the idea of motion one has a considerable number of new 
concepts, for example, “variable,” “limit” as in the above, and the notion of 
“function,” which is a relation between variables analogous to the notion of 
ratio as a relation between magnitudes. Newton’s notion of limit can be 
summarized by Lemma 1 of Book One of the Principia (16): “Quantities and 
ratios of quantities which in any finite time converge continually to equality 
and before the end of that time approach nearer the one to the other than any 
given difference become ultimately equal.” In the next lemma, the “boxing” 
process for an area is thought of as being carried out in time with the sides of 
the rectangles “diminishing in infinitum.” Similarly the ratio of chord to arc 
on a curve is considered as the end points “approach one another.” He insists 
in the Scholium at the end of this sequence of lemmas that he is not concerned 
with “indivisibles,” i.e., infinitely small quantities, but with the “ultimate 
ratio” of ratios of finite quantities. Of course, when the two magnitudes in a 
ratio are zero there is no ratio, but the algebraic process finds one. To bridge 
this gap some concept of “continuity” or simply the existence of the limit is 
required. For a moving “variable” this notion of continuity is intuitive and 
implicit. 

Newton incorporated these notions of limits into the calculus in the 
form of the theory of “fluxions” and “fluents” of “variables.” In present-day 
terminology this would be the theory of derivatives or indefinite integrals of 
functions of time. For these the available algebra, including infinite series, 
permitted a wide range of application. The limiting processes were to a 
considerable extent intuitive, with a free exchange of the order in limiting 
processes. Consider, for example, the differentiation of $x^n$ for n nonintegral. 
Let the increment of x be denoted by $o$. Then Newton had established that 

$$(x + o)^n = x^n + n x^{n-1} o + n(n - 1) x^{n-2} o^2/2 + \cdots.$$

This formula had been inferred by Newton by analogy with the case of n 
integral. For $n = \tfrac12$ this can also be justified by multiplying the series by itself 
term by term. The ratio of the change of $x^n$ to $o$, the change in x, is that of 

$n x^{n-1} + n(n - 1) x^{n-2} o/2 + \cdots$ to 1 and the limit is taken by setting $o$ to zero. 
This can be considered to be either an exchange of order for the limits or an 
assumption of continuity. 
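
As a small check on the term-by-term multiplication mentioned above, the
following Python sketch builds the first few coefficients of the series for
the exponent one half and multiplies the truncated series by itself. It takes
x = 1 for simplicity, which is an assumption made here and not in the text,
and the function names are illustrative only.

```python
from fractions import Fraction

def binomial_coeffs(n, terms):
    """Coefficients c_k of (1 + o)^n = sum c_k o^k, using c_k = c_{k-1} (n - k + 1) / k."""
    coeffs = [Fraction(1)]
    for k in range(1, terms):
        coeffs.append(coeffs[-1] * (n - (k - 1)) / k)
    return coeffs

def square_series(coeffs):
    """Cauchy product of a truncated power series with itself (same truncation order)."""
    terms = len(coeffs)
    return [sum(coeffs[i] * coeffs[k - i] for i in range(k + 1)) for k in range(terms)]

half = binomial_coeffs(Fraction(1, 2), 8)   # 1, 1/2, -1/8, 1/16, ...
squared = square_series(half)               # coefficients of the series times itself
print(half[:4])
print(squared)
```

Up to the truncation order the product has coefficients 1, 1, 0, 0, ..., that
is, it reproduces 1 + o. The coefficient of the first power of the increment
in the original series is n, which is the value of the limiting ratio at
x = 1.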

There was of course a considerable amount of criticism of Newton’s 
calculus and that of Leibnitz. This did not inhibit mathematicians of the 
succeeding era from obtaining and using results such as 

$$\tfrac{1}{4} = 1 - 2 + 3 - 4 + \cdots,$$

which is a consequence of an interchange of limits. In general when rigorous
standards were established such illegitimate offspring were viewed with
disdain. Later, however, methods of handling precisely such relations were
established and the results were justified.
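
One later method that gives such a relation a precise meaning is Abel
summation: each term is weighted by a power of a number x less than 1, the
series is summed, and only then is x allowed to approach 1. The sketch below
is an illustration added here, not a procedure taken from the text. It uses
the known closed form 1/(1 + x)^2 for the weighted series as a comparison.

```python
def abel_partial(x, terms=40000):
    """Sum of (-1)**k * (k + 1) * x**k for k < terms, intended for 0 <= x < 1."""
    total, power = 0.0, 1.0
    for k in range(terms):
        total += (-1) ** k * (k + 1) * power
        power *= x
    return total

for x in (0.9, 0.99, 0.999):
    # The weighted sum and the closed form agree, and both tend to 1/4 as x -> 1.
    print(x, abel_partial(x), 1.0 / (1.0 + x) ** 2)
```

The order of the two limiting processes matters: summing first with x = 1
gives partial sums that oscillate without bound, while summing with x < 1 and
then letting x tend to 1 gives the value one quarter.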

It has not been unusual in mathematics to have the symbolism and 
formal manipulation of a subject extended beyond the original conceptual 
development. The consequent effort to increase the conceptual framework 
has often resulted in important mathematical advances. The notion of higher 
dimensions in geometry, of projective geometry with its points at infinity, 
algebraic ideal theory, and the theory of distributions are examples. 

Thus, while logical criticism could be directed at the new analysis, its 
most important characteristic was the expansion of mathematics, which 
permitted one to express the new scientific outlook. In Newton’s Principia 
this was done for the solar system as far as it was known. Kepler’s laws were 
shown to be equivalent to Newtonian mechanics and universal gravitation, 
and a tremendous range of phenomena such as the tides and precession of 
the equinox was explained. The effect of this could only be described as an 
intellectual revolution. There were also practical effects of matching
importance. The theory of dynamics was essential for the development of industrial
machinery and the geometric aspect of analysis was adequate to determine, 
for example, the desirable shape of gear teeth. 

The student will find the history of the development of the concepts of
mathematics, as available in the references, fascinating. There was, of
course, a rival development of the calculus due to Leibnitz that involved
different notions. Thus, instead of a figure being “generated” by a cross
section of one less dimension, the figure was considered to be the sum of an
“infinite number” of figures of the same dimension but with “infinitesimal
height.” Algebra was used on “differentials” to obtain “differential
coefficients,” i.e., either derivatives or partial derivatives. But just as
in Newtonian calculus, much of the power of the new analysis was based on the
fundamental theorem of the calculus, i.e., on antidifferentiation.

While the Leibnitz manipulations were essentially the same as those of
Newton, a rather vague “principle of continuity” was supposed to be
applicable to the ratio of differentials. This is discussed in the last
chapter of Robinson. (18) It is, of course, the Leibnitz notation that has
survived, but this is now accompanied by a full kit of explanations and
definitions so that a Leibnitz manipulation can be replaced by a modern
mathematical procedure. On the other hand the “nonstandard analysis”
described in Robinson’s book gives an “extension” of the real number system
by set-theoretic procedures in the modern manner that permits one to
interpret directly the language of Leibnitz with “infinitesimally small” and
“infinite” objects. However, a satisfactory interpretation is not facile.

For further discussion of the material in this section, see Boyer,
Robinson, (18) and Smith. (19)


6.3. The Transformation of Mathematics 

The intellectual revolution consequent on the Newtonian scientific 
advances produced a philosophical approach that married mathematics 
and experimentation. There was absolute truth accessible to the human 
mind and expressible in mathematical form, but the choice of the
mathematical form had to be verified by experiment on the mathematical
consequences. Realms of experience were governed by mathematically expressed
laws, and there were complete intellectual satisfaction and important
practical benefits in deriving mathematical relations and verifying them by
experience. 

This philosophical viewpoint could be rather naturally associated with 
the peripatetic (i.e., Aristotelian) notion that mathematical concepts arose 
from experience by abstraction. But the nature of magnitudes available from 
experience had been given by Euclid and Archimedes and did not include 
infinitesimal quantities or infinite magnitudes. Infinite aggregates were in a 
somewhat different category. Thus, there were unsatisfactory aspects to the 
calculus of Leibnitz and in a somewhat different way to that of Newton.

The concern for logical consistency ultimately led to a complete recasting 
of mathematics. One major aspect of this was the determination of what
was logically satisfactory, a motif that became dominant in the nineteenth 
century. But in the eighteenth century and even into the nineteenth, the major 
motivation for the development of mathematics was to increase capability 
as part of the maturing of the exact sciences. 

Logical arguments dealt with equations rather than geometric figures. 
Newton’s Principia was an extension of geometry, but the emphasis of 
succeeding mathematicians was on “analytic” notions such as equations and 
functions, derivatives, integrals, and series. Both Newton and Leibnitz used 
antidifferentiation to replace “summation” approaches to integration. But 
this in turn was generalized to the use of differential equations to solve 
problems as the central procedure of analysis. 

Consequently, mathematics dealt mainly with the numerical value 
associated with a magnitude. If one chooses a unit length and fixed origin 
of a Euclidean line, then the notion of ratio can be equated to that of “point” 
on this line. This yields a concept of “real number,” including negatives. 
Similarly the points of the plane can be associated with complex numbers 
(see Smith (19)). It was these concepts of real and complex number that were
used to express the analytic form of many new concepts, e.g., surfaces, 
multidimensional analysis, differential equations. 

The solution of differential equations often involved infinite series. The 
formal manipulation of “infinite sums” and “infinite products” went through 
a period of great flexibility and questionable logical rigor. But by the
beginning of the nineteenth century, the requirement of convergence led ultimately
to the modern formulations. 

The notion of function and the associated notion of limit passed 
through numerous phases. We have indicated the idea of a function as a 
relation of moving “variables” and the corresponding “approach to a limit.” 
The Taylor series for the basic transcendental functions and the algebraic 
functions were available and were clearly effective in computation. This 
suggested the definition of a function as a formal Taylor series and,
correspondingly, the use of infinitesimals of various orders in the limiting
processes. The physical notion of “motion” is thus eliminated in favor of 
computation. A “variable” is then a symbol as in elementary algebra, and 
the functional relation is operational, i.e., a power series. The formal Taylor 
series concept ran into difficulties with the appearance of the more general 
possibility of a Fourier series. Analysis now dealt exclusively with real 
numbers rather than with the ratio of magnitudes, and notions such as 
continuity and limit were expressible in terms of inequalities. By means of 
Fourier series one could construct continuous functions that were not 
differentiable. It became evident that only the abstract correspondence
notion or the equivalent set-theoretic notion of the graph provided the 
appropriate generality for the function concept. Also, these did provide the 
possibility for an exact formulation of numerical procedures. The use of 
inequalities is readily translated into set-theoretic ideas. The latter had many 
inventive and useful aspects. Thus, the notion of compactness led to a 
satisfactory discussion of the properties of continuous functions. With the 
introduction of the constructive possibilities for infinite sets, this new form 
of analysis was stabilized and received intensive development. 

The precision and inventiveness of modern analysis were highly
significant for further scientific development. But the direct practical use of the new
analysis was far clumsier than the earlier infinitesimal methods. This is 
evident in any calculus course. The older procedures were reformulated as 
theorems in the new mathematics using the older notation. Thus, discussions 
may still be expressed in the earlier language but have a modern significance 
in terms of numerical procedures. 

Thus, mathematics was transformed into a purely numerical subject by 
eliminating all intuitive notions of space and time. In fact the tables were now 
turned. Notions of space and time could be described precisely in a numerical 
fashion. But the conceptual basis of mathematics now rested irrevocably on the
concept of infinite sets. One can, of course, claim that as far as scientific 
applications are concerned only the symbolic formulation is essential and 
that really the conceptual basis plays only a heuristic role. 

The discovery of non-Euclidean geometry indicated that mathematics 
did not necessarily arise from some “true” picture of the universe but was 
essentially inventive, and this was confirmed by the nature of this new logical 
formulation of mathematics. The mathematical theory of a science does not 
have absolute validity but is justified by its agreement with experience, 
specifically in regard to probing experiments. Mathematics is a conceptual 
form for expressing experience patterns or in modern jargon a “language.” 
It does not have factual content and the question of “truth” is irrelevant. The 
existence of a form of mathematics is simply a matter of “freedom from 
contradiction,” i.e., just the capability of a satisfactory logical discussion. 

For further discussion of the material in this section, see Boyer. (4) 


6.4. The Method of Fluxions 

Let us begin by quoting Bishop Berkeley in his famous book of criticism 
The Analyst. A Discussion Addressed to an Infidel Mathematician (1734) 
(see Smith, (19) p. 628):

The Method of Fluxions is the general key, by help whereof the modern mathematician
unlocks the secrets of geometry and consequently of Nature. And as it is that which hath
enabled them so remarkably to outgo the ancients in discovering theorems and solving
problems, the exercise and application thereof is become the main if not the sole
employment of all those who in this age pass for profound geometers.

The latent sarcasm surfaces at the end of this brief quotation, but his 
description of the mathematics of the eighteenth century is quite apt. In what 
way then did the calculus permit the “profound geometers” “to outgo the 
ancients”? The answer lies in the following: (a) They reasoned with
equations instead of or in addition to diagrams. (b) They had analytic tools to
treat far more general geometric objects. (c) They dealt with new magnitudes
of broad practical significance. 

The basic task of geometry had been to structure the quantitative aspects 
of geometric magnitudes. The calculus permitted the introduction of many 
new magnitudes to expand the domains of natural philosophy that were 
subject to quantitative control. On the most elementary level, these
magnitudes dealt with the notions of particle movement, but one also had such
concepts as that of a field of force or field of fluid flow and ultimately a
conceptual framework for the behavior in time of a substance distributed in
space. 

The “Method of Fluxions” contains a conceptual structure that is 
highly significant for modern applied mathematics, and we will discuss it 
without emphasis on the historical development. One can consider geometry 
as having an analytic formulation, but the major adjustment was that one 
considered curves and surfaces, which were associated with the elementary 
notions of lines and planes only in an “infinitesimal” sense. This meant that 
the final result of a discussion had to be obtained by integration, that is, by 
solving a system of differential equations. 

In classical geometry the geometric definition of a locus led to equations
such as the “symptom.” But one can use operational equations on the
coordinates of a point to specify the geometric loci, greatly expanding the
curves and surfaces that can be considered. Alternatively one can use the
parametric form based on the notion of function to specify curves or
surfaces. A curve in three-space is given by three equations,
$x^i = f^i(t)$, $i = 1, 2, 3$, defined for a suitable range of $t$. A surface
is given by $x^i = f^i(r, s)$, where $r$ and $s$ vary over an appropriate
region in the plane. To apply the desired analysis, differentiability
conditions are required, and these do yield a notion of dimensionality. A
more modern notion of manifold uses a number of such parametric
representations of a surface “patched” together in a suitable way.

If the $f^i(t)$ that specify a curve have continuous first derivatives, there
is a natural parameter, the distance $s$ along the curve from a fixed point.
If the $f^i(t)$ have second derivatives and the parameter $t$ is considered
to be the time, then the curve can be considered to be the path of a
particle, and one can specify the vector notions of velocity and
acceleration and the scalar concept of kinetic energy.

Newtonian dynamics utilizes two complementary notions of a moving 
particle and a field of force. A field of force involves a region, at each point of 
which a vector is specified. Both the motion of the particle and the force 
vector are to be expressed in a Cartesian coordinate system for which 
Newton’s second law is valid. This means that there is a property of the particle,
such as mass or electric charge, that couples the force and the acceleration.
If one has found one such coordinate system, any other that is in uniform
translational motion relative to the first will also do.

In the case of gravitational attraction between N particles, the position 
of the particles determines the field of force. Thus, each of the $3N$
Cartesian coordinates, $y^1, \ldots, y^{3N}$, of the particles satisfies a
differential equation

$$m\,\ddot{y}^k = F^k(y^1, \ldots, y^{3N}).$$

This system of second-order differential equations determines the motion of 
the N particles, provided the positions and velocities are given at an instant of 
time. In the case of the solar system numerical solutions of the equations are 
available to within the observational accuracy. The initial success of Newton’s 
theory was its application to astronomical problems. 
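
To make the role of the initial positions and velocities concrete, here is a
minimal Python sketch of a numerical solution of such a system. It is an
illustration under stated assumptions, not the method used for the solar
system: the velocity-Verlet stepping scheme, the unit gravitational constant,
the per-particle mass list, and the two-body test orbit are all choices made
here for the example.

```python
import math

def accelerations(pos, masses, G=1.0):
    """Gravitational acceleration on each particle due to all the others."""
    acc = [[0.0, 0.0, 0.0] for _ in pos]
    for i, pi in enumerate(pos):
        for j, pj in enumerate(pos):
            if i != j:
                r = math.dist(pi, pj)
                for k in range(3):
                    acc[i][k] += G * masses[j] * (pj[k] - pi[k]) / r ** 3
    return acc

def leapfrog(pos, vel, masses, dt, steps):
    """Advance pos and vel in place with the velocity-Verlet scheme."""
    acc = accelerations(pos, masses)
    for _ in range(steps):
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * acc[i][k]   # half kick
                pos[i][k] += dt * vel[i][k]         # drift
        acc = accelerations(pos, masses)
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * acc[i][k]   # second half kick
    return pos, vel

def total_energy(pos, vel, masses, G=1.0):
    """Kinetic plus gravitational potential energy of the system."""
    kinetic = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, vel))
    potential = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            potential -= G * masses[i] * masses[j] / math.dist(pos[i], pos[j])
    return kinetic + potential

# Two equal unit masses on a circular orbit of separation 1 (hypothetical data).
speed = math.sqrt(0.5)
masses = [1.0, 1.0]
pos = [[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
vel = [[0.0, -speed, 0.0], [0.0, speed, 0.0]]
e0 = total_energy(pos, vel, masses)
leapfrog(pos, vel, masses, dt=1e-3, steps=5000)
e1 = total_energy(pos, vel, masses)
separation = math.dist(pos[0], pos[1])
print(e0, e1, separation)
```

The positions and velocities at one instant are the only data supplied; the
differential equations then fix the subsequent motion. The nearly constant
energy and separation serve as a rough check on the numerical solution.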

The notion of work is associated with that of a force acting through a 
distance. If a particle moves through a field of force, the field will do work on 
the particle that can be expressed as an integral. If the particle is displaced the 
vector amount $(dx^1, dx^2, dx^3)$ due to a change of parameter $dt$, then the
work done by the field $(F^1, F^2, F^3)$ is
$dW = F^1\,dx^1 + F^2\,dx^2 + F^3\,dx^3$, and thus the total work done by the
field as the particle traverses the path $\mathfrak{C}$ is

$$W = \int_{\mathfrak{C}} \left( F^1 \frac{dx^1}{dt} + F^2 \frac{dx^2}{dt} + F^3 \frac{dx^3}{dt} \right) dt.$$

This expression is independent of the choice of the parameter $t$, which need
not be the time. If the field of force does not depend on the time, this integral
depends only on the curve $\mathfrak{C}$ and is termed a “line integral.” The work
expression has been obtained on the basis of an analysis using the inner 
product of an infinitesimal displacement with the field-force vector. 
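
The independence of the work from the choice of parameter can be checked
numerically. The following Python sketch sums the inner products of the field
with small displacements along a curve, once for each of two parametrizations
of the same curve. The field, the helix, and the function names are
hypothetical choices made for this illustration.

```python
import math

def line_integral(field, curve, a, b, n=20000):
    """Sum of F . dx over n chords of curve(t), a <= t <= b, with F taken at chord midpoints."""
    total = 0.0
    h = (b - a) / n
    for i in range(n):
        p0 = curve(a + i * h)
        p1 = curve(a + (i + 1) * h)
        mid = [(u + v) / 2 for u, v in zip(p0, p1)]
        F = field(mid)
        total += sum(Fk * (v - u) for Fk, u, v in zip(F, p0, p1))
    return total

def field(p):
    """A time-independent field of force (hypothetical)."""
    return (-p[1], p[0], 0.0)

def helix_t(t):
    """Half a turn of a helix, parameter t in [0, pi]."""
    return (math.cos(t), math.sin(t), t)

def helix_u(u):
    """The same curve with t = pi * u**2, parameter u in [0, 1]."""
    return helix_t(math.pi * u * u)

w_t = line_integral(field, helix_t, 0.0, math.pi)
w_u = line_integral(field, helix_u, 0.0, 1.0)
print(w_t, w_u)
```

For this field and curve the integrand reduces to 1 in the parameter t, so
the work equals pi, and the second parametrization gives the same value to
within the discretization error.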


6.5. The Behavior of Substance in the Eulerian Formulation 

Euler proposed a procedure for describing the behavior of a substance 
that is particularly effective in the case of a moving fluid, but it also has wider 
applications. One considers a region of space with a coordinate system
$x^1, x^2, x^3$ and an interval of time. At any instant in time, $t$, the substance
occupies part or all of this region, so that its behavior can be described by
numerical procedures on functions of $x^1, x^2, x^3$, and $t$. Thus, for example,
we can obtain an approximate description by forming, for each of a discrete
set of values of $t$, a spatial partition with subdivisions of small diameter.
Usually one can assume that any discontinuities in the behavior of the 
substance occur on the boundary of subdivisions, so that one can treat the 
substance within a subdivision as a particle with properties associated with 
some point in the substance. The theoretical particle has a mass similar to
the mass of the substance in the subdivision and has a velocity and
acceleration or other relevant properties. The values of these quantities are given by
certain functions of $x^1, x^2, x^3$, and $t$, and one can indicate the limitations on
the nature and behavior of the substance so that partitions of this type can be
effectively used to approximate the behavior of the substance. Such
approximations are still useful even if one insists on the atomic structure of matter,
since in many cases one can have a very fine partition of the material to justify 
the particle approximation, and yet this partition is coarse relative to the 
atomic structure. 

It is practical then to consider the substance as being described by 
functions of space and time in which the ultimate atomic fine structure is 
ignored. Quantities having practical experimental significance for an instant 
of time, $t$, arise by integration over a volume or in certain cases over a surface.
The properties of the substance that one associates with the integrands,
that is, with functions of points in the substance, are termed “intensive”
and the quantities obtained by integration are called “extensive.” Thus, the
density, $\rho(x^1, x^2, x^3, t)$, is an intensive function that yields the mass in a
region, $\mathfrak{A}$, by integration

$$m = \iiint_{\mathfrak{A}} \rho\,dV.$$


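The velocity of a point in the substance is a vector, $\mathbf{v}(x^1, x^2, x^3, t)$, with
components $v^1, v^2, v^3$, which yields the momentum, $\mathbf{P}$, in a region, $\mathfrak{A}$, at an
instant of time by

$$\mathbf{P} = \iiint_{\mathfrak{A}} \rho\,\mathbf{v}\,dV$$

and the kinetic energy

$$T = \iiint_{\mathfrak{A}} \tfrac{1}{2}\,\rho\,\mathbf{v} \cdot \mathbf{v}\,dV.$$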




For a homogeneous substance intensive quantities do not depend on the 
amount of substance, while extensive quantities are proportional to the 
amount. 

At an instant of time, $t$, an intensive quantity $u(x^1, x^2, x^3, t)$ can be
considered to denote a property of the substance associated with a small
subdivision of the spatial partition. We suppose that this amount of the
substance is identifiable, at least for a small time, $dt$. To determine how this
local property, $u$, of the substance changes in time, $dt$, we must consider the
motion of the substance. Thus, the change is

$$du = u(x^1 + v^1\,dt,\; x^2 + v^2\,dt,\; x^3 + v^3\,dt,\; t + dt) - u(x^1, x^2, x^3, t)$$

$$= \left( \frac{\partial u}{\partial x^1} v^1 + \frac{\partial u}{\partial x^2} v^2 + \frac{\partial u}{\partial x^3} v^3 + \frac{\partial u}{\partial t} \right) dt.$$

Hence we can define the “intrinsic rate of change” of $u$:

$$\frac{Du}{Dt} = \frac{\partial u}{\partial x^1} v^1 + \frac{\partial u}{\partial x^2} v^2 + \frac{\partial u}{\partial x^3} v^3 + \frac{\partial u}{\partial t}.$$



This gives the rate of change of a property associated with a moving part of 
the substance. 
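
The definition can be checked directly by following a piece of the substance
for a short time. In the Python sketch below the property u, the velocity
field v, and the sample point are hypothetical choices made for illustration;
for them the formula gives an intrinsic rate of change equal to t plus the
third coordinate, while the rate at a fixed point of space is the third
coordinate alone.

```python
def u(x, t):
    """An intensive property (hypothetical): x1^2 + x2^2 + t * x3."""
    return x[0] ** 2 + x[1] ** 2 + t * x[2]

def v(x, t):
    """Velocity of the substance (hypothetical): rotation about the x3 axis plus a drift."""
    return (-x[1], x[0], 1.0)

def intrinsic_rate(u, v, x, t, dt=1e-6):
    """Change of u per unit time, following the substance from x to x + v dt."""
    moved = [xi + vi * dt for xi, vi in zip(x, v(x, t))]
    return (u(moved, t + dt) - u(x, t)) / dt

def local_rate(u, x, t, dt=1e-6):
    """Change of u per unit time at a fixed point of space."""
    return (u(x, t + dt) - u(x, t)) / dt

point, time = (1.0, 2.0, 3.0), 0.5
print(intrinsic_rate(u, v, point, time))   # close to time + x3 = 3.5
print(local_rate(u, point, time))          # close to x3 = 3.0
```

The difference between the two numbers is the contribution of the motion of
the substance, that is, of the three terms containing the velocity components.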

Surface integrals are also used to describe the behavior of substances. 
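Consider a surface or part of a surface given by a set of equations,
$x^i = x^i(r, s)$ for $0 \le r \le 1$, $0 \le s \le 1$. We partition the unit
$r$ interval and the unit $s$ interval, $0 = r_0, r_1, \ldots, r_n = 1$,
$0 = s_0, s_1, \ldots, s_m = 1$, and consequently the surface into
quadrilaterals, $\sigma_{ij}$, for which $r_{i-1} \le r \le r_i$,
$s_{j-1} \le s \le s_j$. Let $P'_{ij}$ be a point in $\sigma_{ij}$ with
corresponding $r'_i$ and $s'_j$. Let

$$\mathbf{R} = \mathbf{R}(r'_i, s'_j) = \left( \frac{\partial x^1}{\partial r}(r'_i, s'_j),\; \frac{\partial x^2}{\partial r}(r'_i, s'_j),\; \frac{\partial x^3}{\partial r}(r'_i, s'_j) \right),$$

$$\mathbf{S} = \mathbf{S}(r'_i, s'_j) = \left( \frac{\partial x^1}{\partial s}(r'_i, s'_j),\; \frac{\partial x^2}{\partial s}(r'_i, s'_j),\; \frac{\partial x^3}{\partial s}(r'_i, s'_j) \right).$$

The vectors $\mathbf{R}\,\Delta r$ and $\mathbf{S}\,\Delta s$ lie in the tangent plane to the surface at $P'_{ij}$ and
determine a planar quadrilateral that approximates $\sigma_{ij}$ when the tangent
vectors are appropriately located. The vector $(\mathbf{R} \times \mathbf{S})\,\Delta r\,\Delta s$ is normal to the
surface at $P'_{ij}$ and has the area of the planar quadrilateral as its length. Thus,
$(\mathbf{R} \times \mathbf{S})\,\Delta r\,\Delta s = \mathbf{n}\,\Delta A$, where $\mathbf{n}$ is a unit vector in the normal direction and $\Delta A$
is the quadrilateral area. This construction permits one to approximate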
various quantities by appropriate sums. We will assume that n always 
appears on the same side of the surface. This side will be the “upper side.” 

Let us consider the volume of substance that passes from the lower side 
of the surface to the upper in a short period of time $dt$. Let $\mathbf{v}'$ denote the velocity
of the substance at $P'_{ij}$ so that, in the time $dt$, the substance at $P'_{ij}$ will move to
the point $P'_{ij} + \mathbf{v}'\,dt$. We consider the planar quadrilateral approximating
$\sigma_{ij}$ as also moving in this way so that the substance that passes through $\sigma_{ij}$
in time $dt$ occupies a parallelepiped with the $\sigma_{ij}$ quadrilateral as a base and
$\mathbf{v}'\,dt$ as slant edge. This parallelepiped has altitude $(\mathbf{v}'\,dt) \cdot \mathbf{n}$ and volume
$dV = \mathbf{v}' \cdot \mathbf{n}\,dA\,dt = \mathbf{v}' \cdot (\mathbf{R} \times \mathbf{S})\,dr\,ds\,dt$. This yields

$$\frac{dV}{dt} = \iint \mathbf{v} \cdot (\mathbf{R} \times \mathbf{S})\,dr\,ds.$$

One also has that the rate at which the mass of the substance passes through
the surface is

$$\frac{dM}{dt} = \iint \rho\,\mathbf{v} \cdot (\mathbf{R} \times \mathbf{S})\,dr\,ds.$$


For either of these integrals we can proceed in the following way: 

$$\iint \mathbf{v} \cdot (\mathbf{R} \times \mathbf{S})\,dr\,ds = \iint v^1 \frac{\partial(x^2, x^3)}{\partial(r, s)}\,dr\,ds + \iint v^2 \frac{\partial(x^3, x^1)}{\partial(r, s)}\,dr\,ds + \iint v^3 \frac{\partial(x^1, x^2)}{\partial(r, s)}\,dr\,ds.$$

We assume that

$$J_{23} = \frac{\partial(x^2, x^3)}{\partial(r, s)} = \frac{\partial x^2}{\partial r} \frac{\partial x^3}{\partial s} - \frac{\partial x^3}{\partial r} \frac{\partial x^2}{\partial s}$$

is continuous and that the surface can be divided into a finite number of
regions on which either $J_{23}$ is zero or such that the relations $x^2 = x^2(r, s)$,
$x^3 = x^3(r, s)$ can be inverted and $r$ and $s$ expressed as functions of $x^2$ and $x^3$.
In the latter case $x^1$ will also be a function of $x^2$ and $x^3$ and we have

$$\iint v^1 \frac{\partial(x^2, x^3)}{\partial(r, s)}\,dr\,ds = \pm \iint v^1(x^1(x^2, x^3), x^2, x^3)\,dx^2\,dx^3 = \text{(say)} \iint v^1\,dx^2\,dx^3.$$

By a proper interpretation of the integrals, we may write

$$\frac{dV}{dt} = \iint (v^1\,dx^2\,dx^3 + v^2\,dx^3\,dx^1 + v^3\,dx^1\,dx^2),$$

and we have a similar expression for $dM/dt$.

The integral 

$$\int_{\mathfrak{C}} f^\alpha(x^1, x^2, x^3)\,dx^\alpha = \int f^\alpha \frac{dx^\alpha}{dt}\,dt$$

(summed over $\alpha = 1, 2, 3$) is unchanged if the parameter in the parametric
equations for $\mathfrak{C}$ is changed. Replacing $t$ by another parameter, $t'$, requires
that $t$ be a function $h(t')$ of $t'$, and the chain rule for differentiation
shows that the integral is unchanged. Similarly if one replaced the pair of
parameters, $r, s$, for a surface by another pair, $r', s'$, then $r$ and $s$
are functions of $r'$ and $s'$ and one can show that

$$\frac{\partial(x^2, x^3)}{\partial(r', s')} = \frac{\partial(x^2, x^3)}{\partial(r, s)}\,\frac{\partial(r, s)}{\partial(r', s')}.$$


The rule for changing variables in a double integral will now show that the 
surface integral is not affected by a different parametrization. 

The three expressions $f^\alpha\,dx^\alpha$, $v^1\,dx^2\,dx^3 + v^2\,dx^3\,dx^1 + v^3\,dx^1\,dx^2$, and
$dx^1\,dx^2\,dx^3$ that appear under the integral signs are examples of “differential
forms.” The procedures for handling such forms and the associated integrals
are well developed (see Flanders). A product such as $dx^2\,dx^3$ or $dx^1\,dx^2\,dx^3$,
which occurs in these forms, is not the usual product of differentials but
should be interpreted as

$$dx^2\,dx^3 = \frac{\partial(x^2, x^3)}{\partial(r, s)}\,dr\,ds,$$

where $r$ and $s$ are independent variables. Consequently we have such
properties as $dx^3\,dx^2 = -dx^2\,dx^3$ and $dx^1\,dx^1 = 0$.
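
The construction of this section, with the tangent vectors R and S and the
flux expressed as a double integral over the parameters, can be sketched
numerically as follows. The sphere, the radial velocity field, the use of
central differences for the partial derivatives, and the function names are
assumptions made for this illustration. The second parametrization replaces
r by its square, and the computed flux is unchanged, as the rule for changing
variables predicts.

```python
import math

def cross(a, b):
    """Vector product a x b in three dimensions."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def flux(field, surface, n=200, h=1e-5):
    """Midpoint-rule estimate of the integral of v . (R x S) dr ds over the unit square."""
    total = 0.0
    step = 1.0 / n
    for i in range(n):
        for j in range(n):
            r, s = (i + 0.5) * step, (j + 0.5) * step
            # Central differences approximate R = dx/dr and S = dx/ds.
            R = [(p - q) / (2 * h) for p, q in zip(surface(r + h, s), surface(r - h, s))]
            S = [(p - q) / (2 * h) for p, q in zip(surface(r, s + h), surface(r, s - h))]
            vel = field(surface(r, s))
            total += sum(vk * nk for vk, nk in zip(vel, cross(R, S))) * step * step
    return total

def sphere(r, s):
    """Unit sphere with polar angle pi * r and azimuth 2 * pi * s."""
    theta, phi = math.pi * r, 2 * math.pi * s
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def sphere_reparam(r, s):
    """The same sphere with the first parameter replaced by its square."""
    return sphere(r * r, s)

def radial(x):
    """Velocity field v = (x1, x2, x3)."""
    return x

flux_a = flux(radial, sphere)
flux_b = flux(radial, sphere_reparam)
print(flux_a, flux_b, 4 * math.pi)
```

For this field the flux through the unit sphere is 4 pi, and both
parametrizations reproduce it to within the discretization error.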


6.6. The Generalized Stokes’ Theorem 

There is a very important relation involving differential forms that is 
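illustrated by Gauss’ theorem. Suppose $\mathfrak{B}$ is a surface bounding a
three-dimensional region $\mathfrak{A}$. Then

$$\iint_{\mathfrak{B}} (f^1\,dx^2\,dx^3 + f^2\,dx^3\,dx^1 + f^3\,dx^1\,dx^2) = \iiint_{\mathfrak{A}} \left( \frac{\partial f^1}{\partial x^1} + \frac{\partial f^2}{\partial x^2} + \frac{\partial f^3}{\partial x^3} \right) dx^1\,dx^2\,dx^3 = \iiint_{\mathfrak{A}} (\operatorname{div} \mathbf{f})\,dx^1\,dx^2\,dx^3.$$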


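This permits one to associate an intensive quantity, $\operatorname{div} \mathbf{f}$, with a vector field.
For example, suppose that the region $\mathfrak{A}$ is included in the part of space
occupied by a certain substance at time $t$ and has boundary $\mathfrak{B}$. The substance
in $\mathfrak{A}$ will change its volume at a rate given by

$$\frac{dV}{dt} = \iint_{\mathfrak{B}} (v^1\,dx^2\,dx^3 + v^2\,dx^3\,dx^1 + v^3\,dx^1\,dx^2) = \iiint_{\mathfrak{A}} \left( \frac{\partial v^1}{\partial x^1} + \frac{\partial v^2}{\partial x^2} + \frac{\partial v^3}{\partial x^3} \right) dx^1\,dx^2\,dx^3,$$

where $v^i = v^i(x^1, x^2, x^3, t)$ is the $i$th component of the velocity vector for the
substance. Clearly we can regard $\operatorname{div} \mathbf{v}$ as a growth rate or local coefficient
of expansion for $dV = dx^1\,dx^2\,dx^3$, i.e.,

$$\frac{d}{dt}(dV) = \left( \frac{\partial v^1}{\partial x^1} + \frac{\partial v^2}{\partial x^2} + \frac{\partial v^3}{\partial x^3} \right) dV.$$

An incompressible substance must satisfy $\operatorname{div} \mathbf{v} = 0$. In vector notation, one
can write Gauss’ theorem as

$$\iint_{\mathfrak{B}} \mathbf{f} \cdot \mathbf{n}\,dA = \iiint_{\mathfrak{A}} \operatorname{div} \mathbf{f}\,dV.$$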



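Suppose now that $\mathfrak{A}$ is a fixed region in space occupied by a substance.
The mass $M$ in the region $\mathfrak{A}$ is given by $M = \iiint_{\mathfrak{A}} \rho\,dV$, and the rate of
decrease of mass in the region is

$$-\frac{dM}{dt} = -\iiint_{\mathfrak{A}} \frac{\partial \rho}{\partial t}\,dV.$$

But this is also the rate at which material is leaving $\mathfrak{A}$ through the boundary
$\mathfrak{B}$ of $\mathfrak{A}$:

$$-\iiint_{\mathfrak{A}} \frac{\partial \rho}{\partial t}\,dV = \iint_{\mathfrak{B}} (\rho v^1\,dx^2\,dx^3 + \rho v^2\,dx^3\,dx^1 + \rho v^3\,dx^1\,dx^2) = \iiint_{\mathfrak{A}} \left[ \frac{\partial(\rho v^1)}{\partial x^1} + \frac{\partial(\rho v^2)}{\partial x^2} + \frac{\partial(\rho v^3)}{\partial x^3} \right] dV.$$

Since this holds for any such region $\mathfrak{A}$, one must have

$$\frac{\partial \rho}{\partial t} + \frac{\partial(\rho v^1)}{\partial x^1} + \frac{\partial(\rho v^2)}{\partial x^2} + \frac{\partial(\rho v^3)}{\partial x^3} = 0.$$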


Thus, the density of a substance satisfies this equation, which is termed “the 
equation of continuity.” 
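
The step that converts the surface integral into a volume integral is Gauss’
theorem, and it can be checked numerically on a simple region. The Python
sketch below compares the outward flux of a vector field through the faces of
the unit cube with the volume integral of its divergence. The field, the cube,
and the midpoint rule are hypothetical choices for this illustration; for this
field both integrals equal 3/2.

```python
def field(x, y, z):
    """Vector field f = (x^2 y, y z, x z^2), chosen only for illustration."""
    return (x * x * y, y * z, x * z * z)

def divergence(f, x, y, z, h=1e-4):
    """Central-difference estimate of df1/dx + df2/dy + df3/dz."""
    return ((f(x + h, y, z)[0] - f(x - h, y, z)[0])
            + (f(x, y + h, z)[1] - f(x, y - h, z)[1])
            + (f(x, y, z + h)[2] - f(x, y, z - h)[2])) / (2 * h)

def volume_integral(f, n=20):
    """Midpoint-rule integral of div f over the unit cube."""
    step = 1.0 / n
    mids = [(i + 0.5) * step for i in range(n)]
    return sum(divergence(f, x, y, z) for x in mids for y in mids for z in mids) * step ** 3

def surface_flux(f, n=20):
    """Midpoint-rule outward flux of f through the six faces of the unit cube."""
    step = 1.0 / n
    mids = [(i + 0.5) * step for i in range(n)]
    total = 0.0
    for a in mids:
        for b in mids:
            total += f(1.0, a, b)[0] - f(0.0, a, b)[0]   # faces x = 1 and x = 0
            total += f(a, 1.0, b)[1] - f(a, 0.0, b)[1]   # faces y = 1 and y = 0
            total += f(a, b, 1.0)[2] - f(a, b, 0.0)[2]   # faces z = 1 and z = 0
    return total * step ** 2

print(volume_integral(field), surface_flux(field))
```

On the faces where a coordinate equals 0 the outward normal points in the
negative coordinate direction, which accounts for the minus signs in the flux.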

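A similar result is Stokes’ theorem. Let $\mathfrak{S}$ be a part of a surface bounded
by a curve $\mathfrak{C}$. Then

$$\oint_{\mathfrak{C}} (f^1\,dx^1 + f^2\,dx^2 + f^3\,dx^3) = \iint_{\mathfrak{S}} \left[ \left( \frac{\partial f^3}{\partial x^2} - \frac{\partial f^2}{\partial x^3} \right) dx^2\,dx^3 + \left( \frac{\partial f^1}{\partial x^3} - \frac{\partial f^3}{\partial x^1} \right) dx^3\,dx^1 + \left( \frac{\partial f^2}{\partial x^1} - \frac{\partial f^1}{\partial x^2} \right) dx^1\,dx^2 \right].$$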

Stokes’ and Gauss’ theorems are three-dimensional examples of a more
general result. Suppose we have an integral applicable to a $k$-dimensional
subspace in an $n$-dimensional space. Then

$$\int \cdots \int_{\mathfrak{B}} \sum f^{i_1 \ldots i_k}\,dx^{i_1} \cdots dx^{i_k} = \int \cdots \int_{\mathfrak{A}} \sum \sum_{j=1}^{k+1} (-1)^{j-1} \frac{\partial f^{i_1 \ldots [i_j] \ldots i_{k+1}}}{\partial x^{i_j}}\,dx^{i_1} \cdots dx^{i_{k+1}},$$

where $\mathfrak{B}$ is the $k$-dimensional boundary of the $(k+1)$-dimensional region $\mathfrak{A}$.
The subscripted exponents are ordered $i_1 < i_2 < \cdots$, and the notation $[i_j]$
indicates that $i_j$ is omitted from the sequence. The second integrand is
obtained from the first by replacing each $f^{i_1 \ldots i_k}$ by
$(\partial f^{i_1 \ldots i_k}/\partial x^\alpha)\,dx^\alpha$ and
using the manipulation rules for differential forms, including zeroing a
product of differentials if a differential is repeated and changing signs if two
adjacent differentials are interchanged. This relation between differential
forms is indicated by $d$. Thus, if $\omega$ is a $k$-dimensional form, then

$$\int_{\mathfrak{B}} \omega = \int_{\mathfrak{A}} d\omega.$$

This is called the “generalized Stokes’ theorem.”

The theorems of Gauss and Stokes are used to transform relations 
between integral expressions into partial differential equations and
conversely. Since experimental results must involve quantities obtained by
volume or surface integration, this associates the experimental relations with 
partial differential equations. Techniques for solving partial differential 
equations provide, therefore, uniform procedures for solving a wide class of 
problems. Notice that integration can yield relations that are good to a high 
degree of approximation even when a substance is assumed to have an atomic 
or molecular fine structure. 

Let us discuss some examples of this process. In a field of force F, the 
work done by the field on a particle that has been moved along the curve $\mathfrak{C}$
is $\int_{\mathfrak{C}} (F^1\,dx^1 + F^2\,dx^2 + F^3\,dx^3)$. If $\mathfrak{C}$ extends between the points $P_0$ and $P_1$,
then it may be that for any other curve $\mathfrak{C}'$ within a limited distance of $\mathfrak{C}$,
the work for the curve $\mathfrak{C}'$ is the same as that for $\mathfrak{C}$. If we take a surface $\mathfrak{S}$ such
that $\mathfrak{C}$ lies on $\mathfrak{S}$ and take another curve $\mathfrak{C}'$ on $\mathfrak{S}$ near $\mathfrak{C}$ and also extending
from $P_0$ to $P_1$, then the combination curve $\mathfrak{C}^*$ consisting of $\mathfrak{C}$ and $\mathfrak{C}'$ in the
reverse direction bounds a part of $\mathfrak{S}$, $\mathfrak{B}$. The work done in going around $\mathfrak{C}^*$
is zero, and thus,

$$0 = \oint_{\mathfrak{C}^*} F^\alpha\,dx^\alpha = \iint_{\mathfrak{B}} (\operatorname{curl} \mathbf{F}) \cdot \mathbf{n}\,dA.$$

Since $\mathfrak{S}$ is relatively arbitrary, $\operatorname{curl} \mathbf{F}$ is zero and similarly $\operatorname{curl} \mathbf{F} = 0$ implies
that the field work is unchanged by a limited change in the path of integration.
This result holds even when the field has a line or lines of singularities such
that encircling such a line involves a nonzero amount of work, but the work
is unaffected by a variation of $\mathfrak{C}$ that does not cross a line of singularities.

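If there are no such lines of singularity, the experimental result that the
work is independent of the path is expressed by stating that there is a function
$\phi$ of $P_1$, $(x^1, x^2, x^3)$, such that

$$\phi(x^1, x^2, x^3) = \int_{P_0}^{P_1} F^\alpha\,dx^\alpha,$$

and this formula yields $\operatorname{grad} \phi = \mathbf{F}$. (Changing $P_0$ changes $\phi$ only by adding a
constant. If there are lines of singularity, $\phi$ is “multivalued,” i.e., assumes one
of a number of discrete values.) The necessary and sufficient condition that
there be a $\phi$ such that $\operatorname{grad} \phi = \mathbf{F}$ is that $\operatorname{curl} \mathbf{F} = 0$. In terms of differential
forms, $\phi$ is considered to be a differential form of 0 order, that is, of zero
degree in the differentials $dx^i$, and hence $\operatorname{grad} \phi = \mathbf{F}$ can also be written
$d\phi = F^\alpha\,dx^\alpha$. Again where there are no singularities we can differentiate
$F^\alpha\,dx^\alpha$ again and we have

$$0 = d(d\phi) = (\operatorname{curl} \mathbf{F}) \cdot \mathbf{n}\,dA = [\operatorname{curl}(\operatorname{grad} \phi)] \cdot \mathbf{n}\,dA.$$

If, in addition, $\operatorname{div} \mathbf{F} = 0$, as for the gravitational field in empty space, then

$$\operatorname{div}(\operatorname{grad} \phi) = \frac{\partial^2 \phi}{\partial (x^1)^2} + \frac{\partial^2 \phi}{\partial (x^2)^2} + \frac{\partial^2 \phi}{\partial (x^3)^2} = 0,$$

and $\phi$ satisfies a differential equation $\nabla^2 \phi = 0$, which can be very useful.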

If we are dealing with gravitation the potential due to a single particle 
is readily computed. This computation is used to compute the potential due 
to a body of finite density at a point $(x^1, x^2, x^3)$ outside the body. By
integrating over the volume containing the body we obtain

$$\phi(x^1, x^2, x^3) = G \iiint \rho(y^1, y^2, y^3)\,[(x^1 - y^1)^2 + (x^2 - y^2)^2 + (x^3 - y^3)^2]^{-1/2}\,dy^1\,dy^2\,dy^3 = G \iiint (\rho/r)\,dy^1\,dy^2\,dy^3.$$


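The force due to the first body on another such body is given by integrating
over the second body:

$$\mathbf{F} = \iiint \rho'\,\operatorname{grad} \phi\;dx^1\,dx^2\,dx^3,$$

where $\rho'$ denotes the density of the second body.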




In the case in which one has a number, $n$, of rigid spherical bodies in
each of which the density is a function of the distance from the center,
the problem of mutual gravitational attraction is the same as that for $n$
particles. This means that one can readily eliminate the expressions for the
potentials or forces and obtain a system of simultaneous differential
equations on the position coordinates. In the more general case, the problem is
resolved into determining the potential function from the distribution of 
matter. The forces associated with the potential function then determine the 
movement of the material. We have ignored the finite time of propagation of 
the gravitational field. 

The problems of electrostatics appear at first glance to be similar to those 
of gravity, but in fact the situation is more complicated. One must deal with 
the possibility of surface charge as well as a volumetric charge. In addition, 
one may have “polarization.” An uncharged body is the result of a very large 
number of negative charges canceling the same number of positive charges. 
A “charge” is usually associated with a relatively small difference between
these numbers. However, a small spatial displacement of a much larger 
number of charges can also produce a field. This is referred to as polarization 
and occurs in a substance subject to an electric intensity field. The magnetic 
field has a similar but even more complicated notion of “magnetization.” One 
aspect of these complications is that the expression for the potential may not 
be unique. 

The gradual accumulation of information that finally led to the unified 
theory of electricity and magnetism occurred over two centuries and involved 
experimental investigations by Coulomb, Oersted, Ohm, Ampere, Gauss, 
and Faraday. Our interest is in the use of geometric integration to express 
the experimental results and in the use of Stokes’ theorem to obtain partial 
differential equations. We present a simplified development to indicate these 
aspects. 

One deals with charges and currents in massive bodies at rest. There is 
associated with the electric charges a field of force E, the electric intensity, 
which yields the force on a unit test charge. However, $\mathbf{E}$ has been modified by
electrostatic polarization from another field, $\mathbf{D} = \varepsilon \mathbf{E}$, called the electric flux
density. It is reasonable to consider $\varepsilon$ a constant. Similarly there is a magnetic
intensity field $\mathbf{H}$ and a magnetic flux field $\mathbf{B}$ subject to a matrix relation
$\mathbf{B} = \mu \mathbf{H}$. It is a definite simplification to assume that $\mu$ is just matrix
multiplication by a constant.

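The “flux of force” concept associates with a field of force a “flow” along
“lines of force” so that the flux across an element of surface with area $dA$
and unit normal $\mathbf{n}$ is $\mathbf{F} \cdot \mathbf{n}\, dA$. If the field $\mathbf{F}$ represents an inverse square
law, say, of repulsion due to a charge distribution $q$, then if $S$ is the
boundary of the region $R$ one has the original Gauss result,

\[ \iint_S \mathbf{F} \cdot \mathbf{n}\, dA = 4\pi \iiint_R q\, dV . \]

This yields $(4\pi)^{-1} \operatorname{div} \mathbf{F} = q$. One has that $\mathbf{D}$ and the charge density $\rho$ are
related by $(4\pi)^{-1} \operatorname{div} \mathbf{D} = \rho$. Together with the conservation of charge,
this yields for the current $\mathbf{J}$

\[ \operatorname{div} \mathbf{J} + \frac{\partial \rho}{\partial t} = 0 \]

or

\[ \operatorname{div}\left[ \mathbf{J} + (4\pi)^{-1} \frac{\partial \mathbf{D}}{\partial t} \right] = 0 . \]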
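
The Gauss result used in this passage, that the flux of an inverse-square field through a closed surface is 4π times the enclosed charge, can be checked numerically. The following is only a sketch under stated assumptions: a single point charge, a cube as the closed surface, and a midpoint rule on each face. The name `flux_through_cube` and its parameters are hypothetical.

```python
import math

def flux_through_cube(q, source, half=1.0, n=100):
    """Midpoint-rule estimate of the outward flux of F = q r/|r|^3,
    with r measured from `source`, through the cube [-half, half]^3."""
    h = 2.0 * half / n
    total = 0.0
    for axis in range(3):            # faces come in pairs normal to each axis
        for sign in (-1.0, 1.0):     # outward normal is sign * e_axis
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = sign * half
                    p[(axis + 1) % 3] = -half + (i + 0.5) * h
                    p[(axis + 2) % 3] = -half + (j + 0.5) * h
                    d = [p[k] - source[k] for k in range(3)]
                    r3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
                    # F . n dA on this small square
                    total += q * sign * d[axis] / r3 * h * h
    return total

if __name__ == "__main__":
    print(flux_through_cube(1.0, [0.3, -0.2, 0.1]), 4.0 * math.pi)
    print(flux_through_cube(1.0, [3.0, 0.0, 0.0]))
```

A charge placed anywhere inside the cube should give a value close to 4πq, and a charge outside should give a value close to zero, which is the content of $(4\pi)^{-1} \operatorname{div} \mathbf{F} = q$ in integrated form.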



Ampere investigated the magnetic field associated with a current. In 
particular one can express the work done in moving a unit magnetic pole 
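around a closed curve $\mathfrak{C}$ in terms of $\mathbf{J}$, the current flow through the bounded
surface $\mathfrak{S}$:

\[ \oint_{\mathfrak{C}} \mathbf{H} \cdot d\mathbf{x} = \iint_{\mathfrak{S}} \mathbf{J} \cdot \mathbf{n}\, dA . \]

Stokes’ theorem then yields $\operatorname{curl} \mathbf{H} = \mathbf{J}$, but since $\operatorname{div}(\operatorname{curl} \mathbf{H}) = 0$, this
would force $\operatorname{div} \mathbf{J} = 0$ and so cannot be correct in general. Maxwell added
the term $(1/4\pi)(\partial \mathbf{D}/\partial t)$ and the correct equation is

\[ \operatorname{curl} \mathbf{H} = \mathbf{J} + \frac{1}{4\pi} \frac{\partial \mathbf{D}}{\partial t} . \]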

Faraday investigated the effect of a changing magnetic field on a circuit 
and showed that a voltage was induced in it according to the relation 


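\[ \oint_{\mathfrak{C}} \mathbf{E} \cdot d\mathbf{x} = -\frac{d}{dt} \iint_{\mathfrak{S}} \mathbf{B} \cdot \mathbf{n}\, dA . \]

If the circuit is at rest this yields

\[ \operatorname{curl} \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} . \]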

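We can use the assumptions $\mu \mathbf{H} = \mathbf{B}$, $\mathbf{D} = \varepsilon \mathbf{E}$ for $\mu$ and $\varepsilon$ constants to express
the earlier relation as

\[ \operatorname{curl} \mathbf{B} = \mu \mathbf{J} + \frac{1}{c^2} \frac{\partial \mathbf{E}}{\partial t} , \]

where $c^2 = 4\pi/\mu\varepsilon$.
If one uses $\operatorname{curl} \operatorname{curl} \mathbf{E} = \operatorname{grad}(\operatorname{div} \mathbf{E}) - \nabla^2 \mathbf{E}$, one can eliminate $\mathbf{B}$ to obtain

\[ \nabla^2 \mathbf{E} - \frac{1}{c^2} \frac{\partial^2 \mathbf{E}}{\partial t^2} = \mu \frac{\partial \mathbf{J}}{\partial t} + \frac{4\pi}{\varepsilon} \operatorname{grad} \rho . \]

One also has a relationship $\operatorname{div} \mathbf{B} = 0$ and a similar argument to the above
yields

\[ \nabla^2 \mathbf{B} - \frac{1}{c^2} \frac{\partial^2 \mathbf{B}}{\partial t^2} = -\mu \operatorname{curl} \mathbf{J} . \]

One can consider $\mathbf{J}$ and $\rho$ as given, as in radiation problems, or $\mathbf{J}$ and $\rho$ may
be determined in terms of $\mathbf{E}$ and $\mathbf{B}$ by the substance. For example, inside a
conductor one would have $\mathbf{J} = \sigma \mathbf{E}$ and $\rho = 0$.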

In order to handle the differential equations, the variations in the force 
vectors across boundaries between different substances must be known. 
One has that the tangential components of E and H are continuous across
such a boundary and the normal components of D and B are continuous. 

The people who developed the theory of electricity and magnetism 
apparently were motivated by a purely intellectual drive to understand. There 
is a gap in time between the development of the theory and its application. 
However, the application has had an enormous cultural effect. 

For further discussion of the material in this section, see Abraham and 
Becker, (1) Bergmann, (2) Mason and Weaver, (16) and Whittaker. (24)


6.7. The Calculus of Variations 

In addition to the geometric constructions of Gauss’ and Stokes’
theorems, the calculus of variations also yields problems in differential
equations. This area of mathematics arose from the solution of the
brachistochrone problem by the Bernoullis at the close of the seventeenth
century.

Suppose we have an object at a point $O$ in a vertical plane. This object
can be considered to be a bead that will slide without friction on a wire in the
plane to a lower point $P_0$. The problem is to find the curve for the wire such
that the time of descent will be a minimum.

One takes a coordinate system with origin at $O$, x axis horizontal, y
positive downward, and with $P_0$ having coordinates $(x_0, y_0)$. Let $(x, y)$ be any
point on the arc. The kinetic energy of the object will equal the decline in
potential energy,






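\[ \frac{1}{2} \left( \frac{ds}{dt} \right)^2 = g y , \]

where $s$ is the arc length along the wire and both energies are taken per unit mass.
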
Chap. 6 • Natural Philosophy 




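If $y'$ is the derivative of y with respect to x, this yields

\[ (1 + y'^2) \left( \frac{dx}{dt} \right)^2 = 2gy . \]

Thus,

\[ \left( \frac{1 + y'^2}{2gy} \right)^{1/2} dx = dt , \]

and the time of descent is

\[ \int_0^{x_0} \left( \frac{1 + y'^2}{2gy} \right)^{1/2} dx . \]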

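This is a problem of the following type. One has a function of three
variables $F(x, y, y')$. We must choose a function y with given values at 0 and
$x_0$ that will minimize the integral

\[ I = \int_0^{x_0} F(x, y, y')\, dx . \]

Suppose y is a solution to this problem and $\delta y$ is any function with a
continuous derivative that is zero at 0 and $x_0$. Let

\[ I(\varepsilon) = \int_0^{x_0} F(x,\, y + \varepsilon\, \delta y,\, y' + \varepsilon\, \delta y')\, dx . \]

Then $I(\varepsilon)$ must have a minimum for $\varepsilon = 0$ and $dI/d\varepsilon = 0$ at $\varepsilon = 0$. Formally we
have, after integrating the second term by parts and using the end conditions
on $\delta y$,

\[ 0 = \int_0^{x_0} \left( \frac{\partial F}{\partial y}\, \delta y + \frac{\partial F}{\partial y'}\, \delta y' \right) dx = \int_0^{x_0} \left[ \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \right] \delta y\, dx . \]
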

This last equation is satisfied by all differentiable $\delta y$ that satisfy the end
conditions. If y is such that $\partial F/\partial y - (d/dx)(\partial F/\partial y')$ is continuous as a function
of x, this is only possible if

\[ \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) = 0 . \]

This is called the Euler equation for the problem. If $\partial^2 F/\partial y'^2$ is not zero, this
is a second-order differential equation, which is a necessary condition for the
solution to the problem. The theory investigates sufficient conditions.

In our example we can take

\[ F = [(1 + y'^2)/y]^{1/2} . \]

One can show that the Euler equation for this problem implies

\[ \frac{2y''}{1 + y'^2} + \frac{1}{y} = 0 . \]

By multiplying by $y'$, this equation can be integrated to yield

\[ (1 + y'^2)\, y = 2a . \]

This first-order differential equation can be integrated and the solution, which
at x = 0 has the value y = 0, has the parametric form

\[ y = a(1 - \cos\theta), \qquad x = a(\theta - \sin\theta) . \]

One can also show that $\theta = (g/a)^{1/2} t$. This curve is also called the “cycloid.”
The quantity a must be determined so that the “brachistochrone” goes
through $(x_0, y_0)$.
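
The determination of a from the endpoint, and the claim that the cycloid gives a shorter time than another curve, can be illustrated numerically. The sketch below is an illustration only: the function names, the bisection method for finding θ at the endpoint, the value g = 9.81, and the comparison with a straight wire are all assumptions that are not in the text. It uses $\theta = (g/a)^{1/2} t$ for the cycloid and the elementary inclined-plane formula for the straight wire.

```python
import math

def cycloid_through(x0, y0):
    """Return (a, theta0) such that x = a(theta - sin theta),
    y = a(1 - cos theta) passes through (x0, y0) at theta = theta0.
    Assumes x0 > 0 and y0 > 0 (y is measured downward)."""
    target = x0 / y0
    lo, hi = 1e-3, 2.0 * math.pi - 1e-3
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        # (theta - sin theta)/(1 - cos theta) increases with theta on (0, 2 pi)
        ratio = (mid - math.sin(mid)) / (1.0 - math.cos(mid))
        if ratio < target:
            lo = mid
        else:
            hi = mid
    theta0 = 0.5 * (lo + hi)
    return y0 / (1.0 - math.cos(theta0)), theta0

def cycloid_time(x0, y0, g=9.81):
    """Time of descent along the cycloid, from theta = sqrt(g/a) * t."""
    a, theta0 = cycloid_through(x0, y0)
    return theta0 * math.sqrt(a / g)

def straight_line_time(x0, y0, g=9.81):
    """Time of descent from rest along a straight wire to (x0, y0)."""
    length = math.hypot(x0, y0)
    return math.sqrt(2.0 * length * length / (g * y0))

if __name__ == "__main__":
    print(cycloid_time(1.0, 1.0), straight_line_time(1.0, 1.0))
```

For any endpoint with $x_0 > 0$ and $y_0 > 0$ the first number printed should be smaller than the second, in agreement with the minimum property.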

The calculus of variations was applicable to a considerable number of 
interesting geometric and physical problems. It is still important in many 
practical situations such as optimal control. It structured the Lagrangian 
and Hamiltonian formulations of dynamics. 

From the mathematical point of view its importance lay in the fact
that it dealt with sets of functions rather than, say, sets of numbers. We have
minimum problems in which one chooses the best function rather than the
most appropriate value of a variable. This led to Volterra’s theory of
functions of curves, which was the predecessor of the modern theory of function
spaces and abstract topological spaces.

For further discussion of the material in this section, see Bliss. (3) 


6.8. Dynamics 

Newton’s dynamics was formulated in terms of a Galilean system of 
coordinates. The spatial coordinates are therefore Euclidean, which is quite 
restrictive, and we will discuss the formulation of mechanics due to Lagrange, 
which permits a much more general choice of spatial coordinates. 

We begin with a Newtonian formulation. In order to deal with an
extensive body, one must consider it to be a collection of particles. We have for the
Euclidean coordinates a system of differential equations $m_J \ddot{x}_J^i = F_J^i$, $i = 1, 2, 3$,
$J = 1, \ldots, N$. It is convenient to consider a single superscript with range
$i = 1, \ldots, 3N$. In the case of a rigid body there are many relations between the
coordinates of the particles that always hold. The forces are such that these
relations appear as integrals of these equations. If we have R such relations
between the coordinates, we can make a change of variables so that R of the
new coordinates are constants corresponding to these R restraints, and there
are n new variables $q^1, \ldots, q^n$ with $n = 3N - R$. However, there may be other
relations between the new variables that do not correspond to a functional
relation between coordinates, for example, a relation between the
differentials of these variables such as $A_{i\alpha}\, dq^\alpha = 0$ or a kinetic relation such as
$A_{i\alpha}\, \dot{q}^\alpha = 0$ (see Whittaker (23)). A relation between differentials, $A_{i\alpha}\, dq^\alpha = 0$,
corresponds to a functional relation $F(q^1, \ldots, q^n) = c$ only if the $A_{ij}$ values are
proportional to $\partial F/\partial q^j$. A relation that is not equivalent to a functional
relation is called nonholonomic, and a mechanical system that is not subject to
any nonholonomic relation is termed holonomic.

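The forces $F_i$ at a point can be represented relative to the differential
vector space of the $(dx^i)$ as corresponding to a linear functional with value
$F_\alpha\, dx^\alpha$. If we perform a theoretical displacement indicated by the operator
$\delta$, $W(\delta) = F_\alpha\, \delta x^\alpha$ is the “virtual work” corresponding to the “virtual
displacement.”

The Newtonian equations $m_i \ddot{x}^i = F_i$ can be transformed into a form that
is suitable for a change of variable. One introduces the kinetic energy,
$T = \frac{1}{2} m_\alpha (\dot{x}^\alpha)^2$, for which

\[ m_i \ddot{x}^i = \frac{d}{dt} \left( \frac{\partial T}{\partial \dot{x}^i} \right) - \frac{\partial T}{\partial x^i} , \]

where the partial derivatives correspond to T as a function of $x^i$ and $\dot{x}^i$ and
consequently $\partial T/\partial \dot{x}^i = m_i \dot{x}^i$ and $\partial T/\partial x^i = 0$. A history of the mechanical
system corresponds to a curve, $\mathfrak{C}$, $x^i = x^i(t)$, in the 3N-dimensional x space.
The original equation $F_i - m_i \ddot{x}^i = 0$ can now be written

\[ F_i + \frac{\partial T}{\partial x^i} - \frac{d}{dt} \left( \frac{\partial T}{\partial \dot{x}^i} \right) = 0 . \]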

We introduce a “virtual displacement” of $\mathfrak{C}$ in the manner of the calculus
of variations, i.e., with zero displacement at the terminal points. If we multiply
the above equation by $\delta x^i$ and sum over i and integrate along the curve $\mathfrak{C}$,
we get two integral terms whose sum is zero. One of these, $\int F_\alpha\, \delta x^\alpha\, dt = \int W(\delta)\, dt$,
has an invariant integrand under changes of the space variables.
(The parameter t is also the time and plays a special role.) The other term is

\[ \int \left[ \frac{\partial T}{\partial x^\alpha}\, \delta x^\alpha - \frac{d}{dt} \left( \frac{\partial T}{\partial \dot{x}^\alpha} \right) \delta x^\alpha \right] dt = \int \left[ \frac{\partial T}{\partial x^\alpha}\, \delta x^\alpha + \frac{\partial T}{\partial \dot{x}^\alpha}\, \delta \dot{x}^\alpha \right] dt = \delta \int T\, dt \]

by the usual argument of the calculus of variations, in which one integrates
the second term by parts using $(d/dt)\, \delta x^\alpha = \delta (d/dt) x^\alpha = \delta \dot{x}^\alpha$ and the fact that
$\delta x^\alpha = 0$ at the terminal points of $\mathfrak{C}$. Thus, our original Newtonian equations
$m_i \ddot{x}^i = F_i$ imply

\[ \int W(\delta)\, dt + \delta \int T\, dt = 0 , \]

which is invariant under changes in the spatial or kinetic variables, and the
calculus of variations will yield the converse result.

We can use the restraints that can be expressed in terms of the
coordinates and time to introduce a new system of spatial coordinates, R of which
are constants for the motion. The remaining variables $q^1, \ldots, q^n$ now
determine an n-dimensional subspace of the original 3N-dimensional manifold
by equations

\[ x^i = x^i(q^1, \ldots, q^n, t) . \]

(The time variable t is the same before and after the change of coordinates.)
The curve $\mathfrak{C}$ is a path in this subspace so that there exists a preimage $\mathfrak{C}^*$ in the
q space. The virtual displacements that satisfy the restraint restrictions are
also in this subspace and hence correspond to virtual displacements of $\mathfrak{C}^*$
in the q space. In our original variation result, we can make the change of
variables from the 3N variables $x^i$ and suppress the new variables, which are
constant because of the restraints. The result is a variation statement on $\mathfrak{C}^*$,

\[ \int_{\mathfrak{C}^*} W(\delta)\, dt + \delta \int_{\mathfrak{C}^*} T\, dt = 0 . \]

For further discussion of the material in this section, see Whittaker. (23)

If there are no nonholonomic restraints, this is now just a calculus of
variations problem on the $q^1, \ldots, q^n$. Since it is the $q^i$ as functions of the time
that are usually desired, this represents an elimination of the awkward
restraint relations. If one has nonholonomic restraints, for example, $A_{i\alpha}\, dx^\alpha = 0$,
these restraints are transformed by the change of variables into another form,
for example, $B_{i\alpha}\, \delta q^\alpha = 0$, since one can then consider the variation of the
restraint variables as zero. The variation problem can then be considered as
an extremal problem in the calculus of variations subject to auxiliary
restraints. In any case we have a formulation of mechanics of great technical
value that permits transformations of the q variables.

The value of this procedure lies in the fact that both T and $W(\delta)$ can be
expressed in terms of the $q^1, \ldots, q^n$. Thus,

\[ T = \frac{1}{2} m_\alpha (\dot{x}^\alpha)^2 = \frac{1}{2} m_\alpha \left( \frac{\partial x^\alpha}{\partial q^\beta}\, \dot{q}^\beta + \frac{\partial x^\alpha}{\partial t} \right)^2 \]

and

\[ W(\delta) = F_\alpha\, \delta x^\alpha = F_\alpha \frac{\partial x^\alpha}{\partial q^\beta}\, \delta q^\beta = \text{(say)}\ Q_\beta\, \delta q^\beta . \]

(The time is not varied in the virtual displacements.) In practice one would try
to express T and $W(\delta)$ directly in terms of the $q^i$, as for example in the case of
a pendulum. There is a special case in which $W(\delta)$ can be expressed as the
variation of a potential function, i.e., $W(\delta) = -\delta W = -(\partial W/\partial q^\alpha)\, \delta q^\alpha$, in which
case the variation problem can be expressed as

\[ \delta \int (T - W)\, dt = 0 . \]
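
The pendulum mentioned above gives a short worked instance of this variational statement. The following is a sketch under assumptions that are not in the text: a point mass m on a massless rod of length l, the single coordinate q taken to be the angle θ from the downward vertical, and g the acceleration of gravity.

```latex
\begin{aligned}
T &= \tfrac{1}{2} m l^2 \dot{\theta}^2, \qquad W = -m g l \cos\theta, \\
W(\delta) &= -\delta W = -m g l \sin\theta \,\delta\theta,
  \qquad\text{so that}\qquad Q_\theta = -m g l \sin\theta, \\
0 &= \delta \int (T - W)\, dt
   = \int \left( m l^2 \dot{\theta}\,\delta\dot{\theta}
        - m g l \sin\theta\,\delta\theta \right) dt \\
  &= -\int \left( m l^2 \ddot{\theta} + m g l \sin\theta \right)
        \delta\theta \, dt .
\end{aligned}
```

The last step integrates by parts with δθ = 0 at the terminal points. Since δθ is otherwise arbitrary, the motion satisfies $m l^2 \ddot{\theta} + m g l \sin\theta = 0$, and the restraint (the fixed length of the rod) never has to be written as a force.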

Another special case is that in which the “generalized forces,” the $Q_i$, are
linear in the $\dot{q}^i$; for example, $Q_i = Q_{i0} + a_{i\alpha}\, \dot{q}^\alpha$. This situation can be interpreted
in the case where the $a_{ij}$ are constants. If the matrix $(a_{ij})$ is symmetric and
negative definite, one is dealing with a dissipation of energy, as, for example,
the dissipation due to resistance in an electric circuit. On the other hand a
constant magnetic field acting on a moving charged particle will yield a
situation in which the $a_{ij}$ are antisymmetric.

When one has a potential function, W, we have the Lagrangian $L = T - W$
and the Lagrangian equations from the variational principle

\[ \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}^i} \right) - \frac{\partial L}{\partial q^i} = 0 , \qquad i = 1, \ldots, n . \]

This is a system of n second-order differential equations on the n functions
$q^1(t), \ldots, q^n(t)$. There is considerable theoretical interest in an alternate
formulation in terms of the momenta, $p_i = \partial L/\partial \dot{q}^i$, regarded as functions of the
time. These constitute n linear equations on the $\dot{q}^i$ and thus can be solved to
express the $\dot{q}^i$ as linear combinations of the $p_i$. Consider

\[ H(p, q) = p_\alpha \dot{q}^\alpha - L(q, \dot{q}) . \]

Here the $\dot{q}^i$ on the right-hand side are considered to be functions of $p_j$ and
$q^j$. Consequently

\[ \frac{\partial H}{\partial q^i} = p_\alpha \frac{\partial \dot{q}^\alpha}{\partial q^i} - \frac{\partial L}{\partial q^i} - \frac{\partial L}{\partial \dot{q}^\alpha} \frac{\partial \dot{q}^\alpha}{\partial q^i} = -\frac{\partial L}{\partial q^i} , \]

\[ \frac{\partial H}{\partial p_i} = \dot{q}^i + p_\alpha \frac{\partial \dot{q}^\alpha}{\partial p_i} - \frac{\partial L}{\partial \dot{q}^\alpha} \frac{\partial \dot{q}^\alpha}{\partial p_i} = \dot{q}^i . \]

Thus, the original system of differential equations is equivalent to the 2n
equations

\[ \dot{p}_i = -\frac{\partial H}{\partial q^i} , \qquad \dot{q}^i = \frac{\partial H}{\partial p_i} , \]

in which the right-hand side is expressed in terms of the $p_i$ and $q^i$ and the
system is explicitly solved for the derivatives.

The exploration of the use of the Hamiltonian is of course part of the
theory of classical dynamics. This includes the role of the Hamilton-Jacobi
equation, which is a first-order partial differential equation in the form
$H(\partial S/\partial q^i, q^i) = 0$ on an unknown function $S(q^1, \ldots, q^n)$. One important aspect
is the study of canonical transformations, i.e., those transformations on the
2n-dimensional p, q space that still yield equivalent dynamic problems.
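
Because the Hamiltonian system is explicitly solved for the derivatives, it is well suited to step-by-step numerical integration. The following is a minimal sketch, assuming a plane pendulum with $H = p^2/(2ml^2) - mgl\cos q$; the function names, the leapfrog scheme, and the parameter values are illustrative assumptions and do not come from the text.

```python
import math

def hamiltonian(p, q, m=1.0, l=1.0, g=9.81):
    """H(p, q) for a plane pendulum; q is the angle from the vertical."""
    return p * p / (2.0 * m * l * l) - m * g * l * math.cos(q)

def integrate(p, q, dt=1e-3, steps=10000, m=1.0, l=1.0, g=9.81):
    """Leapfrog integration of dp/dt = -dH/dq, dq/dt = dH/dp."""
    for _ in range(steps):
        p -= 0.5 * dt * m * g * l * math.sin(q)   # half step in p
        q += dt * p / (m * l * l)                 # full step in q
        p -= 0.5 * dt * m * g * l * math.sin(q)   # half step in p
    return p, q

if __name__ == "__main__":
    p0, q0 = 0.0, 1.0
    p1, q1 = integrate(p0, q0)
    print(hamiltonian(p0, q0), hamiltonian(p1, q1))
```

The two printed values should agree closely, since H is constant along solutions when it does not depend explicitly on the time.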


6.9. Manifolds 

It is natural to describe the behavior of substances in terms of a rather
general three-dimensional geometry. However, the evolution of the theory of
dynamics in the hands of Euler, Lagrange, Hamilton, and Jacobi proceeded
in a much more general n-dimensional framework, paralleling the
development of appropriate mathematical tools such as the theory of differential
equations and the calculus of variations. In the nineteenth century the
analytic formulation of geometric ideas crystallized into the concept of
invariant structures associated with permissible transformation groups,
leading to Riemannian and more general geometries. These ideas are
particularly appropriate for dynamics and eventually provided remarkable
flexibility in the formulation of modern theories such as relativity.

Let us consider in an informal way the modern notion of a differentiable
manifold. A differentiable manifold of n dimensions is a point set or “space”
that is smooth enough so that at each point one can define a “differential
vector space.” The problem is to set this up in an analytic manner. There are
many formal procedures for doing so that are logically satisfactory but
provide little insight. Let us describe the setup informally. Consider the manifold
as being divided into a number of overlapping patches. Each patch is in a
one-to-one correspondence with an open unit cube in n-dimensional space.
Since each point $P_0$ is in at least one patch, each $P_0$ has at least one set of
coordinates $x^1, \ldots, x^n$. If $P_0$ is in the overlap of two patches and has two sets
of coordinates $x^1, \ldots, x^n$ and $y^1, \ldots, y^n$, then there is an open neighborhood of
$x^1, \ldots, x^n$ that is in a one-to-one correspondence with an open neighborhood
of $y^1, \ldots, y^n$ given by n functions, $y^i = f^i(x^1, \ldots, x^n)$, which are indefinitely
differentiable. Thus, if the assignment of coordinates yields more than one
set, these different assignments have a local relationship that can be
differentiated any number of times. We refer to the mapping of a patch of the
manifold onto the unit n cube as a “chart” and the whole procedure including
the overlap functions as an “atlas.”

We suppose one such atlas as given. Now consider another such atlas
such that if a point P on the manifold has coordinates $x^1, \ldots, x^n$ from a chart
of the first atlas and coordinates $y^1, \ldots, y^n$ from a chart of the second atlas,
then again we can find an open neighborhood of $x^1, \ldots, x^n$ that is mapped
into an open neighborhood of $y^1, \ldots, y^n$ by this relation in terms of a set of
n indefinitely differentiable functions, and the inverse mapping has the
same property. This relationship between two such atlases is symmetric and
transitive and one can consider the collection of atlases that are equivalent
to the given one by this relation.

Such a collection of atlases is a global structure on our manifold that 
can be used to specify a “geometry.” For it is clear that any one-to-one 
mapping of the basic point set of the manifold onto itself will take an atlas 
into an atlas. The group of transformations that determines the geometry is 
the set of such mappings that preserves the collection. It is intuitively clear 
that the local geometric properties of the manifold are those that are
preserved by passing from one chart to another, i.e., those preserved by the
$f^i$ mappings given above.

These charting constructions imply the existence of a “differential vector
space” at each point $P_0$ of the manifold. In other words, the geometry of
manifolds includes such a space, since this space can be shown to be invariant
under the group transformations. The invariance is for the space as an entity,
not for the individual vectors. To show this we consider a function F(P) of
the points of the manifold. For any chart that contains $P_0$, there is a
neighborhood of the coordinates $x_0^1, \ldots, x_0^n$ such that for points P with coordinates
$x^1, \ldots, x^n$ in this neighborhood, $F(P) = F(x^1, \ldots, x^n)$. Suppose F is indefinitely
differentiable. Then at $P_0$, $dF = (\partial F/\partial x^\alpha)\, dx^\alpha$ for any differential vector
$(dx^1, \ldots, dx^n)$. If we make the change of variable $y^i = f^i(x^1, \ldots, x^n)$, the
differential vector is transformed by the equation $dy^i = a^i_\alpha\, dx^\alpha$, where
$a^i_\alpha = \partial y^i/\partial x^\alpha$. Thus, the space of such differential vectors is an affine space.

The analytic expressions for geometric notions in this affine space
involve numerical arrays called “tensors.” When the reference coordinates
are changed as in the preceding paragraph, these arrays are transformed by
means of the $a^i_j = \partial y^i/\partial x^j$ and the $A^i_j = \partial x^i/\partial y^j$. Thus, the vector $(dx^1, \ldots, dx^n)$
becomes $(dy^1, \ldots, dy^n)$, where $dy^i = a^i_\alpha\, dx^\alpha$. This latter relation also implies
$dx^i = A^i_\alpha\, dy^\alpha$, which will specify the transformation of the coefficients of
differential forms. Thus, the differential form $c_{\alpha\beta}\, dx^\alpha\, dx^\beta$ becomes $c'_{\alpha\beta}\, dy^\alpha\, dy^\beta$
for $c'_{ij} = c_{\alpha\beta} A^\alpha_i A^\beta_j$. The components of tensors are usually expressed with
subscripts and superscripts. For example, the component of a tensor t may be
expressed as $t^i_{jk}$, which indicates that the transformation rule is
$t'^i_{jk} = a^i_\alpha\, t^\alpha_{\beta\gamma}\, A^\beta_j A^\gamma_k$.
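
The rule $c'_{ij} = c_{\alpha\beta} A^\alpha_i A^\beta_j$ can be tried on a familiar case. The following sketch assumes the plane with Cartesian coordinates x and polar coordinates y = (r, θ), and takes $c_{\alpha\beta}$ to be the identity array; these choices and all function names are illustrative assumptions, not part of the text. The expected outcome is the array diag(1, r²), and the form $c_{\alpha\beta}\, dx^\alpha\, dx^\beta$ should keep its value.

```python
import math

def jacobian_x_wrt_y(r, theta):
    """A[i][j] = dx^i/dy^j for x = (r cos theta, r sin theta), y = (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def transform_covariant(c, A):
    """c'_{ij} = c_{ab} A^a_i A^b_j (sum over a and b)."""
    n = len(c)
    return [[sum(c[a][b] * A[a][i] * A[b][j]
                 for a in range(n) for b in range(n))
             for j in range(n)] for i in range(n)]

def quadratic_form(c, v):
    """Value of c_{ab} v^a v^b."""
    n = len(c)
    return sum(c[a][b] * v[a] * v[b] for a in range(n) for b in range(n))

if __name__ == "__main__":
    r, theta = 2.0, 0.7
    A = jacobian_x_wrt_y(r, theta)
    c = [[1.0, 0.0], [0.0, 1.0]]
    c_new = transform_covariant(c, A)
    dy = [0.1, -0.2]
    dx = [sum(A[i][j] * dy[j] for j in range(2)) for i in range(2)]
    print(c_new)
    print(quadratic_form(c, dx), quadratic_form(c_new, dy))
```

The components change from chart to chart, but the number assigned to a given differential vector does not, which is the sense in which the form is an invariant.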

The relations expressible in terms of this infinitesimal affine geometry are
invariant under the larger set of coordinate transformations. One can
formulate the problem of establishing global consequences of local relations.
Less generally, integration concepts can be introduced in the form of
integrals over k-dimensional submanifolds of our original n-dimensional
manifold. These integrals have integrands which are kth-order exterior forms over
the n-dimensional differential vector space. The k-dimensional submanifold
must be divided into pieces each of which corresponds to a k-dimensional
subset of the n cube of a specific chart. For each such piece the numerical
interpretation of the integral is clear. However, the integrand must have a
certain tensorial significance if the result is to be independent of the choice
of the chart. Thus, the integral

\[ \int P_{\alpha\beta\gamma}\, dx^\alpha\, dx^\beta\, dx^\gamma \]

is a scalar invariant if the $P_{ijk}$ are the components of a covariant tensor.

For further discussion of the material in this section, see Helgason (11)
and Whitney. (22)


6.10. The Weyl Connection 

Tensor invariance corresponds to the least specific structures on the
manifold. An additional type of structure is based on a construction due to
H. Weyl that compares the differential vector space at a point P with that at
any “infinitesimally displaced” point Q. The vector space $E_P$ is related to the
equivalent space $E_Q$ by a linear transformation G, which depends on P and Q.
Thus, if $(dx^1, \ldots, dx^n)$ is a differential vector at P, then $G(dx^1, \ldots, dx^n)$ is a
differential vector at Q, $(x^1 + \delta x^1, \ldots, x^n + \delta x^n)$. It is reasonable to assume that
G can be described by a matrix

\[ G = \{ \delta^i_j + \Gamma^i_{j\alpha}\, \delta x^\alpha \} = I + \Gamma_\alpha\, \delta x^\alpha , \]

where $I$ and $\Gamma_\alpha$ are transformations given by the appropriate matrices.

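One must also specify the situation relative to overlap. Suppose at P
we have another chart and coordinate system that assigns to P the
coordinates $y^1, \ldots, y^n$ and Q the coordinates $y^1 + \delta y^1, \ldots, y^n + \delta y^n$. Then
$y^i = f^i(x^1, \ldots, x^n)$ and $dy^i = a^i_\alpha\, dx^\alpha$ for $a^i_j = \partial f^i/\partial x^j$. We use the obvious notation
$dy = a\, dx$, $dx = A\, dy$ for $A^i_j = \partial x^i/\partial y^j$. Let the bar indicate a reference to the
point Q. Then $d\bar{y} = \bar{a}\, d\bar{x} = (a + \delta a)\, d\bar{x} = (a + A_\alpha\, \delta x^\alpha)\, d\bar{x}$, where $A_k$ is the matrix
$\partial^2 y^i/\partial x^j \partial x^k$ where $i, j = 1, \ldots, n$. Thus,

\[ d\bar{y} = (a + A_\alpha\, \delta x^\alpha)\, d\bar{x} = (a + A_\alpha\, \delta x^\alpha)\, G\, dx = (a + A_\alpha\, \delta x^\alpha)(I + \Gamma_\beta\, \delta x^\beta)\, dx \]

\[ = (a + A_\alpha\, \delta x^\alpha)(I + \Gamma_\beta\, \delta x^\beta)\, A\, dy = dy + (A_\alpha\, \delta x^\alpha)\, A\, dy + a\, (\Gamma_\beta\, \delta x^\beta)\, A\, dy + \cdots . \]

If we use tensor rather than matrix notation this becomes

\[ d\bar{y}^i = dy^i + \frac{\partial^2 y^i}{\partial x^\rho \partial x^\sigma}\, A^\rho_\beta A^\sigma_\alpha\, \delta y^\alpha\, dy^\beta + a^i_\gamma\, \Gamma^\gamma_{\rho\sigma}\, A^\rho_\beta A^\sigma_\alpha\, \delta y^\alpha\, dy^\beta \]

\[ = dy^i + \left( \frac{\partial^2 y^i}{\partial x^\rho \partial x^\sigma} + a^i_\gamma\, \Gamma^\gamma_{\rho\sigma} \right) A^\rho_\beta A^\sigma_\alpha\, dy^\beta\, \delta y^\alpha , \]

and if the prime indicates the connection for the $y^i$ and $\bar{y}^i$,

\[ \Gamma'^i_{jk} = \left( \frac{\partial^2 y^i}{\partial x^\rho \partial x^\sigma} + a^i_\gamma\, \Gamma^\gamma_{\rho\sigma} \right) A^\rho_j A^\sigma_k . \]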

Of course, the change of charts relative to a change of atlas yields the same
relation. One can show that if one makes a further change of coordinates from
y to z, one gets the same relation between the $\Gamma''^i_{jk}$ for the $z^i$ and $\bar{z}^i$ and the
original $\Gamma^i_{jk}$ for the $x^i$ and $\bar{x}^i$ as one would obtain if one made a direct change
from x to z, eliminating the intermediate transformation to the y coordinates.
Thus, if the $\Gamma^i_{jk}$ are given in one coordinate system, they are uniquely
determined for any other coordinate system and have a unique transformation
relationship relative to change of charts. The $\Gamma^i_{jk}$ can therefore be introduced
as a further geometric structure beyond the tensor relations.

Because of the second-order partial derivative term in the
transformation formula, the $\Gamma^i_{jk}$ are not components of a tensor. However, for a change
of coordinates that is given locally by a linear matrix with constant elements,
the transformation rule is a tensor rule. In particular if one can find a
coordinate system for which the $\Gamma^i_{jk}$ are zero, then the space has a local affine
character. In general, if we are given a coordinate system $x^1, \ldots, x^n$ with a
corresponding set of $\Gamma^i_{jk}$, we can find another coordinate system $y^1, \ldots, y^n$
with $\Gamma'^i_{jk} = 0$ if we can solve the system of equations

\[ \frac{\partial y^i}{\partial x^\alpha}\, \Gamma^\alpha_{jk} + \frac{\partial^2 y^i}{\partial x^j \partial x^k} = 0 . \]

These equations imply

\[ \Gamma^i_{jk} = \Gamma^i_{kj} \]

and

\[ \frac{\partial \Gamma^i_{jk}}{\partial x^l} - \frac{\partial \Gamma^i_{jl}}{\partial x^k} + \Gamma^i_{\alpha k} \Gamma^\alpha_{jl} - \Gamma^i_{\alpha l} \Gamma^\alpha_{jk} = 0 . \]



The relation between the differential vector space of “infinitesimally
close” points implies a relation between the differential vector spaces of
points that can be joined by an appropriate curve $\mathfrak{C}$. Such a curve would
consist of a finite sequence of adjoined arcs $\mathfrak{C}_i$, each of which is in the patch
of the manifold associated with a chart. The relation in general, then, is
implied by the relations between the vector spaces of the endpoints of such
an arc.

For the coordinate system associated with the chart a specific arc can be
represented by the parametric equations $x^i = X^i(t)$ for $0 \le t \le 1$. To avoid
confusion let us represent differential vectors by single letters, i.e., by
$\delta = (\delta^1, \ldots, \delta^n)$, and if one such vector is chosen at every point on the curve
P(t), we can denote it $\delta(t) = (\delta^1(t), \delta^2(t), \ldots, \delta^n(t))$. Suppose we consider $\delta(t)$ as
being moved along the curve in accordance with the connection, i.e.,

\[ \delta^i(t + dt) = \delta^i(t) + \Gamma^i_{\alpha\beta}\, \delta^\alpha(t)\, \frac{dX^\beta}{dt}\, dt \]

or

\[ \frac{d}{dt}(\delta^i) = \Gamma^i_{\alpha\beta}\, \delta^\alpha\, \frac{dX^\beta}{dt} . \]

This is a system of differential equations and a vector associated with the
solution $(\delta^1(t), \delta^2(t), \ldots, \delta^n(t))$ can be considered to be the transported vector
of the corresponding initial conditions $(\delta_0^1, \ldots, \delta_0^n)$ at the initial point of the
arc.

For further discussion of the material in this section, see Brillouin, (5) 
Helgason, (11) and Whitney. (22)


6.11. The Riemannian Metric 

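The invariance of tensor relations corresponds to the affine character of
the differential vector space, and the connection $\Gamma^i_{jk}$ represents additional
geometric structure. The affine character of a linear space can be specialized
into a Euclidean space by designating a tensor $g_{ij}$ for each point on the
manifold. Such a tensor, $g_{ij}$, determines a connection $\Gamma^i_{jk}$ by the condition that the
transport operation preserves the inner product relation. Let $\eta$ and $\xi$ denote
two differential vectors at a point $x^1, \ldots, x^n$ and consider the transport of
$g_{\alpha\beta}\, \eta^\alpha \xi^\beta$ to the point $x^1 + \delta x^1, \ldots, x^n + \delta x^n$. Then

\[ 0 = \delta(g_{\alpha\beta}\, \eta^\alpha \xi^\beta) = \frac{\partial g_{\alpha\beta}}{\partial x^\gamma}\, \delta x^\gamma\, \eta^\alpha \xi^\beta + g_{\alpha\beta}\, \delta\eta^\alpha\, \xi^\beta + g_{\alpha\beta}\, \eta^\alpha\, \delta\xi^\beta \]

\[ = \frac{\partial g_{\alpha\beta}}{\partial x^\gamma}\, \delta x^\gamma\, \eta^\alpha \xi^\beta + g_{\mu\beta}\, \Gamma^\mu_{\alpha\gamma}\, \eta^\alpha \xi^\beta\, \delta x^\gamma + g_{\alpha\mu}\, \Gamma^\mu_{\beta\gamma}\, \eta^\alpha \xi^\beta\, \delta x^\gamma \]

\[ = \left( \frac{\partial g_{\alpha\beta}}{\partial x^\gamma} + g_{\mu\beta}\, \Gamma^\mu_{\alpha\gamma} + g_{\alpha\mu}\, \Gamma^\mu_{\beta\gamma} \right) \eta^\alpha \xi^\beta\, \delta x^\gamma . \]

Thus the connection is determined by the relations

\[ \frac{\partial g_{ij}}{\partial x^k} + g_{\alpha j}\, \Gamma^\alpha_{ik} + g_{i\alpha}\, \Gamma^\alpha_{jk} = 0 . \]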

The $g_{ij}$ and the $\Gamma^i_{jk}$ are considered to be symmetric in the subscripts. If this
equation is written three times with cyclic permutations of the subscripts,
that is, first for i, j, k, then for j, k, i, and then for k, i, j, and the first equation
is subtracted from the sum of the other two, one obtains

\[ \frac{1}{2} \left( \frac{\partial g_{jk}}{\partial x^i} + \frac{\partial g_{ik}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^k} \right) + g_{\alpha k}\, \Gamma^\alpha_{ij} = 0 . \]

Since we assume that the matrix of the $g_{ij}$ is not singular, this determines
the connection $\Gamma^i_{jk}$. The connection associated with the “metric tensor”
$g_{ij}$ is said to correspond to a “parallel displacement” of the differential vector.
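
Parallel displacement can be followed numerically in a simple case. The sketch below assumes the unit sphere with coordinates (θ, φ) and metric $g = \mathrm{diag}(1, \sin^2\theta)$, and obtains the connection from the relation just derived, which in the sign convention of this section gives $\Gamma^\theta_{\varphi\varphi} = \sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = -\cot\theta$. The choice of example, the Runge-Kutta integrator, and all names are assumptions made for illustration. The transport law used is $d\delta^i/dt = \Gamma^i_{\alpha\beta}\, \delta^\alpha\, dX^\beta/dt$.

```python
import math

def gamma(theta):
    """Connection G[i][a][b] of the unit sphere, coordinates (theta, phi),
    in the sign convention of this section (transport adds Gamma terms)."""
    s, c = math.sin(theta), math.cos(theta)
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = s * c
    G[1][0][1] = G[1][1][0] = -c / s
    return G

def rhs(t, d, curve, velocity):
    """d(delta^i)/dt = Gamma^i_{ab} delta^a dX^b/dt along the curve."""
    theta, _ = curve(t)
    G = gamma(theta)
    v = velocity(t)
    return [sum(G[i][a][b] * d[a] * v[b] for a in range(2) for b in range(2))
            for i in range(2)]

def transport(d0, curve, velocity, t_end, steps=2000):
    """Fourth-order Runge-Kutta integration of the transport equations."""
    h = t_end / steps
    d, t = list(d0), 0.0
    for _ in range(steps):
        k1 = rhs(t, d, curve, velocity)
        k2 = rhs(t + h / 2, [d[i] + h / 2 * k1[i] for i in range(2)], curve, velocity)
        k3 = rhs(t + h / 2, [d[i] + h / 2 * k2[i] for i in range(2)], curve, velocity)
        k4 = rhs(t + h, [d[i] + h * k3[i] for i in range(2)], curve, velocity)
        d = [d[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(2)]
        t += h
    return d

if __name__ == "__main__":
    theta0 = math.pi / 3
    circle = lambda t: (theta0, t)      # circle of latitude, phi = t
    speed = lambda t: (0.0, 1.0)
    print(transport([1.0, 0.0], circle, speed, 2.0 * math.pi))
```

Carried once around the circle of latitude θ = π/3, the vector returns with its length $g_{\alpha\beta}\, \delta^\alpha \delta^\beta$ unchanged but with its direction reversed, which shows that the result of transport between two points depends on the curve that joins them.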

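Along an arc located within a given chart, the parallel displacement of a
vector is given by

\[ \frac{d\delta^i}{dt} = \Gamma^i_{\alpha\beta}\, \delta^\alpha\, \frac{dX^\beta}{dt} . \]

This equation is valid for any choice of the parameter t. Since $g_{\alpha\beta}\, dx^\alpha\, dx^\beta$ is
now an invariant for the geometry, if the arc is such that

\[ \frac{ds}{dt} = \left( g_{\alpha\beta}\, \frac{dX^\alpha}{dt} \frac{dX^\beta}{dt} \right)^{1/2} \]

is not zero along the curve, we can introduce s as a parameter and consider
the curves obtained by displacing the tangent vector in its own direction.
This would yield the system of differential equations obtained by replacing
the $\delta^i$ by $dx^i/ds$ and t by s,

\[ \frac{d^2 x^i}{ds^2} = \Gamma^i_{\alpha\beta}\, \frac{dx^\alpha}{ds} \frac{dx^\beta}{ds} . \]

The curves that satisfy these equations are the “geodesics.” This
refers to the fact that such a curve is an extremal for the integral

\[ \int ds = \int (g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta)^{1/2}\, dt = \int T\, dt . \]

The Euler equations for this variation,

\[ \frac{d}{dt} \left( \frac{\partial T}{\partial \dot{x}^i} \right) = \frac{\partial T}{\partial x^i} , \]

become

\[ \frac{d}{dt} \left( \frac{g_{i\alpha}\, \dot{x}^\alpha}{(g_{\mu\nu}\, \dot{x}^\mu \dot{x}^\nu)^{1/2}} \right) = \frac{1}{2 (g_{\mu\nu}\, \dot{x}^\mu \dot{x}^\nu)^{1/2}}\, \frac{\partial g_{\alpha\beta}}{\partial x^i}\, \dot{x}^\alpha \dot{x}^\beta . \]

Since $dt/ds = (g_{\alpha\beta}\, \dot{x}^\alpha \dot{x}^\beta)^{-1/2}$, we can multiply these equations by $dt/ds$ and
obtain

\[ \frac{d}{ds} \left( g_{i\alpha}\, \frac{dx^\alpha}{ds} \right) = \frac{1}{2}\, \frac{\partial g_{\alpha\beta}}{\partial x^i}\, \frac{dx^\alpha}{ds} \frac{dx^\beta}{ds} . \]

This yields

\[ g_{i\alpha}\, \frac{d^2 x^\alpha}{ds^2} = \frac{1}{2} \left( \frac{\partial g_{\alpha\beta}}{\partial x^i} - \frac{\partial g_{i\alpha}}{\partial x^\beta} - \frac{\partial g_{i\beta}}{\partial x^\alpha} \right) \frac{dx^\alpha}{ds} \frac{dx^\beta}{ds} , \]

and this is equivalent to the geodesic equations.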

An extremal condition is, of course, independent of the choice of the
coordinates. The principle of general relativity requires that the laws of
physics be independent of the choice of the coordinates used to express space
and time. Thus, in modern theories of motion the space-time continuum is
given a Riemannian geometric structure and the motion of an “infinitesimal
particle” is described as moving on a geodesic. This means that the
second-order Newtonian equations are replaced by the geodesic equations, but the
Newtonian equations must be an excellent approximation when the speed
of the particle is small relative to the speed of light. Instead of “forces” in
the new equations, we have $\Gamma$ terms with geometric significance. This has
yielded a more precise description of gravitation in the solar system.

But there are other forces besides gravitation, and efforts have been
made to describe them by using the more general notion of a connection or $\Gamma$
concept. For if the $\Gamma^i_{jk}$ are given, there will not be in general a tensor $g_{ij}$
that satisfies the relation

\[ \frac{\partial g_{ij}}{\partial x^k} + g_{i\alpha}\, \Gamma^\alpha_{kj} + g_{j\alpha}\, \Gamma^\alpha_{ki} = 0 , \]

since one can readily obtain necessary conditions on the $\Gamma^i_{jk}$ that are not
always fulfilled. Those developments are discussed in Brillouin, (5) Lanczos,
and Weyl. (21)

Thus, the development of this analysis has produced subtle and beautiful 
intellectual concepts. For further discussion of the material in this section, 
see Eisenhart. (7) 

For further discussion of the material in this chapter, consult Heath, (9) 
Heitler, (10) Kellogg, (12) Lamb, (14) Van der Waerden, (20) and Whittaker. (23)


Exercises 

6.1. By means of summation formulas, show that the area under $y = x^n$ between 0
and a is $a^{n+1}/(n+1)$.

6.2. The “principle of Cavalieri” is trivial in terms of modern theories of integration.
But it is interesting to see what is required to validate the argument given by Cavalieri
(see Smith, (19) p. 605), e.g., in the case of areas.

6.3. Compare the geometrical arguments of Section Two of Book One of Newton’s
Principia (6) with the corresponding modern analytic arguments.

6.4. How does

\[ \tfrac{1}{4} = 1 - 2 + 3 - 4 + \cdots \]

arise by interchanging limits? Evaluate the partial sums of the power series
$1 - 2x + 3x^2 - 4x^3 + \cdots$. What happens as x approaches 1? What is the effect of introducing Cesàro
summation? In regard to multiplying series term by term, what is the relationship
between the square of a partial sum for $1/(1+x)$ and a remainder and a partial sum for
$1/(1+x)^2$ and its remainder?

6.5. Explore the geometric theory of the shape of gear teeth. 

6.6. For a curve in three dimensions given in parametric form, determine the
osculating plane, the curvature, and the normal.

6.7. Let $v^i(x^1, x^2, x^3, t)$ denote the velocity of the substance at the point $(x^1, x^2, x^3)$
at time t. We take a fixed time $t_0$ as a reference time. The point in the substance that at
$t_0$ was at $x_0^1, x_0^2, x_0^3$ is at time t at the point $x^1, x^2, x^3$. We have then a transformation on
the spatial coordinates for each time value t given by $x^i = x^i(x_0^1, x_0^2, x_0^3, t)$, which is the
solution of the system of differential equations

\[ \frac{dx^i}{dt} = v^i(x^1, x^2, x^3, t) , \]

which at $t = t_0$ takes on the value $x_0^1, x_0^2, x_0^3$. How are the partial derivatives $\partial x^i/\partial x_0^j$
determined? How is an integral

\[ \iint (f^1\, dx^2\, dx^3 + f^2\, dx^3\, dx^1 + f^3\, dx^1\, dx^2) \]

expressed in terms of $x_0^1, x_0^2, x_0^3$?

6.8. State the implicit function theorem for 3 or n variables. What is the condition
for “functional dependence”? What is the formula for a change of independent variables
in a multiple definite integral?

6.9. What are “existence theorems” for systems of differential equations? What are
“uniqueness theorems”? Give examples of each type.

6.10. Show that for an incompressible substance $D\rho/Dt = 0$.

6.11. At one time, heat Q was considered a fluid and the temperature T in a
homogeneous substance was assumed to be proportional to the local density of heat fluid. Thus,
Q was an extensive quantity related to the intensive quantity T by

\[ Q = \iiint_{\mathfrak{R}} C\, T\, dV , \]

where the constant C is the “specific heat.” The rate of flow of Q was considered to be
proportional to the negative temperature gradient. If $\mathfrak{S}$ is the surface bounding the
region $\mathfrak{R}$ and n is as usual the outward normal, then this assumption can be written

\[ \frac{dQ}{dt} = \iint_{\mathfrak{S}} k\, (\operatorname{grad} T) \cdot n\, dA , \]

where k is the “heat conductivity.” Since this equation is valid for an arbitrary region $\mathfrak{R}$
these assumptions lead to the “equation of heat”:

\[ \frac{\partial^2 T}{\partial (x^1)^2} + \frac{\partial^2 T}{\partial (x^2)^2} + \frac{\partial^2 T}{\partial (x^3)^2} = \frac{1}{a^2} \frac{\partial T}{\partial t} \]

with $a^2 = k/C$.

6.12. A $k$-dimensional subspace in $n$-dimensional space is given by $n$ equations 
$x^i = x^i(t^1, \ldots, t^k)$. If in the differential form $\sum f_{\alpha_1\ldots\alpha_k}\,dx^{\alpha_1}\cdots dx^{\alpha_k}$ we substitute 
$dx^i = (\partial x^i/\partial t^\alpha)\,dt^\alpha$ and use the manipulation rules, what is the integrand for integration 
relative to $t^1, \ldots, t^k$? 

6.13. Prove Stokes’ and Gauss’ theorems. How are the rules for differentiating 
“differential forms” applicable? (See Flanders.(8)) 

6.14. How is the area of a patch of surface $\iint dA$ expressed in terms of the vectors 
R and S? How is this related to the Archimedean definition of area of a surface? 

6.15. Show that a vector field $F$ can be expressed in the form $G + H$, where curl $G = 0$ 
and div $H = 0$. Prove that if curl $G = 0$, then $G = \operatorname{grad}\phi$ for a function $\phi$ of $x^1, x^2, x^3$. Prove 
that if div $H = 0$, then there is a vector field $A$ such that curl $A = H$. (This is a very standard 
theorem that is used in practically every theory. It is suggested that if the student is not 
familiar with it, he try to establish it on his own before looking it up in one of the standard 
books.) 

6.16. Obtain Stokes' theorem and Gauss' theorem from the formula for the 
generalized Stokes' theorem. Also show that $d(d\omega) = 0$. 

6.17. Obtain the potential function of a thin spherical shell and show that for a 
point outside the shell it is the same as that of a particle of the same mass located at the 
center. What is the potential function inside the shell and what is the significance of this 
result? What is the potential function of a rigid sphere with density proportional to the 
distance to the center? What is the energy of a configuration of two such bodies relative 
to when they are at an infinite distance? What is the attractive force between them? 

6.18. Consider the contribution to the potential $\phi$ at the point $P(x^1, x^2, x^3)$ of a 
number of charges, $e_i$, in a small region $\mathfrak{A}$ around a point $Q$. The diameter of $\mathfrak{A}$ is small 
relative to $r = PQ$. Suppose $e_i$ is at $P_i$, $l_i$ is the distance $P_iQ$, and $\angle P_iQP = \theta_i$. Then 

$$r_i = \operatorname{dist}(P_i, P) = (r^2 + l_i^2 - 2l_ir\cos\theta_i)^{1/2}$$

and 

$$\Delta\phi = \sum_i \frac{e_i}{r_i} = \frac{1}{r}\sum_i e_i + \frac{1}{r^2}\sum_i e_il_i\cos\theta_i + \cdots.$$

If $z_i$ is the vector $QP_i$ and $j$ is a unit vector in the direction $QP$, then 
$\sum e_il_i\cos\theta_i = (\sum e_iz_i)\cdot j = P\cdot j$ for the polarization $P = \sum e_iz_i$. Thus, we have a 
contribution to the potential in the form 

$$\Delta\phi = \frac{1}{r}\sum_i e_i + \frac{P\cdot j}{r^2} + \cdots.$$

(a) What is the form of the higher terms in the expression for $\Delta\phi$? 

(b) How does polarization contribute to potential on the surface of a body? 
Discuss the ambiguity in the expression for $\phi$ when both volumetric and surface 
terms are used. 

(c) Suppose the charges occurred in matched pairs, $e_i$, $-e_i$ with displacement 
vector $r$ between them. What is the formula for the contribution of such “dipoles” to 
the potential? 

6.19. Prove the Gauss result for a repulsive charge distribution. 

6.20. The argument in the text used to derive Euler's equation in the calculus of 
variations assumes that one can differentiate $(\partial F/\partial y')(x, y, y')$ and thus that the second 
derivative of the unknown function exists. On the other hand, if $y$ and $y'$ are continuous, 
then usually we can integrate $\partial F/\partial y$. Let $G(x) = \int_0^x (\partial F/\partial y)\,dx + C$. Then integration by 
parts yields 

$$0 = \int_0^{x_0}\left[\frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y'\right]dx = \int_0^{x_0}\left[\frac{\partial F}{\partial y'} - G(x)\right]\delta y'\,dx + G\,\delta y\,\Big|_0^{x_0}.$$

Now $\delta y = 0$ at $x = 0$ and $x = x_0$ and thus $\int_0^{x_0}\delta y'\,dx = 0$ for all permissible variations $\delta y$. 
We have, then, instead of Euler’s equation the result that 

$$\frac{\partial F}{\partial y'}(x, y, y') = \int_0^x \frac{\partial F}{\partial y}\,dx + k$$

for some constant $k$. What are sufficient conditions that insure that $y'$ has a continuous 
derivative and under what conditions does $y'$ have a right and left derivative? 

6.21. When one can as a practical matter assume the existence of a potential energy 
function of configuration, the Lagrangian variation theory can be applied to a nonrigid 
body. Consider, for example, a vibrating stretched elastic string satisfying Hooke’s 
law. Consider a piece of string of length $\lambda_0$ when it is subject to no tension. If this piece is 
stretched to the length $\lambda = \lambda_0 + \mu$, then Hooke’s law states that there is a constant $h$ 
such that the tension required is $F = h\mu/\lambda_0$. The work done in stretching this piece is 
$W = h\mu^2/2\lambda_0 = h(\lambda - \lambda_0)^2/2\lambda_0$. 

Suppose the original string has unstretched length $l_0$ and is stretched to a length $l$ 
and laid along the $x$ axis from $(0, 0)$ to $(l, 0)$. A vibration of this stretched string can be 
described by two functions $u(x, t)$, $v(x, t)$ with $0 \le x \le l$, $0 \le t$, which are such that the 
point $(x, 0)$ on this initially stretched string is at the point $(x + u, v)$ at the time $t$. We 
assume that $u(x, 0)$, $v(x, 0)$, $\partial u(x, 0)/\partial t$, $\partial v(x, 0)/\partial t$ are given and that the endpoints are 
held fixed; i.e., $u(0, t) = v(0, t) = 0$, $u(l, t) = v(l, t) = 0$ for $t > 0$. The energy of configuration is 
obtained by considering the piece of string that in the initial stretched state lies between 
$(x, 0)$ and $(x + dx, 0)$. This has unstretched length $l_0\,dx/l$ and at time $t$ has length 
$[(1 + u_x)^2 + v_x^2]^{1/2}\,dx$. We assume that the radical can be approximated by means of the 
formula $(1 + a)^{1/2} = 1 + a/2$, since $u_x$ and $v_x$ are small. Then the configuration potential 
energy is 

$$W = h\,\frac{l - l_0}{2l_0}\int_0^l \left[(1 + u_x)^2 + v_x^2 - 1\right]dx + K$$

for a constant $K$. If $M$ is the total mass of the string, the kinetic energy is 

$$T = \frac{M}{2l}\int_0^l (v_t^2 + u_t^2)\,dx.$$

If $c^2 = hl(l - l_0)/Ml_0$, the variation principle becomes 

$$0 = \delta\iint \left\{u_t^2 + v_t^2 - c^2\left[(1 + u_x)^2 + v_x^2 - 1\right]\right\}dx\,dt,$$

and the Euler equations are 

$$c^2u_{xx} = u_{tt} \quad\text{and}\quad c^2v_{xx} = v_{tt}.$$


The solutions must satisfy the boundary and initial conditions given above. 

6.22. Obtain the Euler equation for the brachistochrone and integrate it into the 
given parametric form. What is the geometric interpretation of the name “cycloid”? 
How can one determine $a$ so that the curve goes through $(x_0, y_0)$? 

6.23. Consider oxygen gas at atmospheric pressure and 0°C temperature. Show 
that the number of impacts of molecules during a microsecond on a square of side 1 μm 
on the container wall is about 2 billion. (Use the method in Chapter 1 of Mayer and 
Mayer.(11)) 
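
The order of magnitude in Exercise 6.23 can be checked with a short calculation. The sketch below is not claimed to be the method of Mayer and Mayer; it assumes an ideal gas and compares two common textbook estimates of the wall flux: a crude one in which one-sixth of the molecules move toward the wall at the root-mean-square speed, and the Maxwell-averaged flux $n\bar{v}/4$. The function name and the molar mass value are assumptions introduced here.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
N_A = 6.02214076e23       # Avogadro constant, 1/mol
MOLAR_MASS_O2 = 0.031998  # kg/mol (assumed value for O2)

def wall_impacts(pressure, temperature, area, duration):
    """Return (crude, kinetic) estimates of molecular impacts on a wall patch."""
    n = pressure / (K_B * temperature)            # number density from p = n k T
    m = MOLAR_MASS_O2 / N_A                       # mass of one molecule
    v_rms = math.sqrt(3 * K_B * temperature / m)  # root-mean-square speed
    v_mean = math.sqrt(8 * K_B * temperature / (math.pi * m))  # mean speed
    crude = n * v_rms / 6 * area * duration       # one-sixth head for the wall
    kinetic = n * v_mean / 4 * area * duration    # Maxwell-averaged flux
    return crude, kinetic

# 1 atm, 0 degrees C, a square of side 1 micrometer, one microsecond
crude, kinetic = wall_impacts(101325.0, 273.15, (1e-6) ** 2, 1e-6)
print(f"crude estimate:   {crude:.2e}")
print(f"kinetic estimate: {kinetic:.2e}")
```

Both estimates come out at a few billion impacts, with the cruder one closest to the figure quoted in the exercise.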

6.24. Show that if a one-to-one mapping of the basic point set of a manifold takes 
one atlas of a collection into another atlas of the same collection it takes every atlas of the 
collection into one of the collection. 
6.25. Show that two changes of charts, say, from $x$ coordinates to $y$ coordinates 
and then from $y$ to $z$ yield the same relation between the $\Gamma^i_{jk}$ associated with $x$ and the 
$\Gamma^i_{jk}$ associated with the $z$ as a direct change from $x$ to $z$. 

6.26. Determine the integrability conditions for the system 

$$\frac{\partial y^i}{\partial x^\alpha}\,\Gamma^\alpha_{jk} + \frac{\partial^2 y^i}{\partial x^j\,\partial x^k} = 0.$$


6.27. Consider the equation 

+ ff./w +g i.n y =<o 

for determining the $\Gamma^i_{jk}$ under the assumption that the $g_{ij}$ are symmetric but without 
assuming symmetry for the $\Gamma^i_{jk}$. Show that under these circumstances the $\Gamma^i_{jk}$ must be 
symmetric in the subscripts. Discuss the situation in which the $g_{ij}$ are antisymmetric. 
7.1. The Motion of Bodies 

In discussing the flight trainer we obtained a description of the motion 
of a rigid body in terms of the external forces acting on it. In general, however, 
bodies are not so rigid that relative motion between the parts can be completely 
neglected. In many cases the general shape of the body is still retained, 
so that it seems desirable to distinguish a gross motion for the body as a whole 
and a relative motion for the parts. The distinction we will make between the 
two motions does have certain arbitrary aspects, but these are consequent 
on practical exigencies. 

Newton’s second law that action and reaction are equal has certain 
consequences for an arbitrary system of $N$ particles. Let the coordinates of 
these particles be $y^i_j$, $i = 1, 2, 3$, $j = 1, \ldots, N$, and let $y_j$ denote the vector 
displacement of the $j$th particle, and $F_j$ the resultant force on the $j$th particle. In a 
Galilean frame we have $m_j\ddot{y}_j = F_j$ and consequently for the center of gravity 
$Y = (\sum m_\alpha y_\alpha)/M$, we have 

$$M\ddot{Y} = \sum_\alpha F_\alpha = F_R. \tag{1}$$

In the summation, $F_R$, forces between particles cancel by Newton’s second 
law and we need only sum over forces on the particles due to external 
aspects. 

We also have $m_j(\ddot{y}_j - \ddot{Y}) = F_j - m_jF_R/M$ and 

$$m_j(y_j - Y)\times(\ddot{y}_j - \ddot{Y}) = (y_j - Y)\times(F_j - m_jF_R/M),$$

which can be written 

$$\frac{d}{dt}\left[m_j(y_j - Y)\times(\dot{y}_j - \dot{Y})\right] = (y_j - Y)\times F_j - m_j(y_j - Y)\times F_R/M.$$

If we sum over the particles, the last term contributes zero, since $\sum m_\alpha(y_\alpha - Y) = 0$, so that we have for 

$$L = \sum_\alpha m_\alpha(y_\alpha - Y)\times(\dot{y}_\alpha - \dot{Y})$$

the relation 

$$\frac{dL}{dt} = \sum_\alpha (y_\alpha - Y)\times F_\alpha = T. \tag{2}$$

In the summation for $T$, the collinearity of action and reaction implies that 
again we need only add over forces on particles that have an external effect. 

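When the relation (1) is applied to substance in general it shows that if 
the motion is not disruptive it must be possible to introduce an intensive 
quantity $f$, the “force per unit volume” corresponding to Newton’s third law. 
Let $\mathfrak{A}$ be any region in the substance. The center of gravity of the material in 
$\mathfrak{A}$ is given by 

$$Y = \iiint_{\mathfrak{A}} y\rho\,dV \Big/ \iiint_{\mathfrak{A}} \rho\,dV.$$

$Y$ corresponds to the motion of the material in $\mathfrak{A}$. We can regard the integrals 
$\iiint\rho\,dV$ and $\iiint y\rho\,dV$ as summations relative to a large number of 
small particles with $dM = \rho\,dV$ and thus 

$$\frac{d}{dt}M_{\mathfrak{A}} = \frac{d}{dt}\iiint_{\mathfrak{A}} \rho\,dV = 0,$$

where $M_{\mathfrak{A}}$ is the mass of the material in $\mathfrak{A}$. This can also be expressed by the relation 

$$\frac{D}{Dt}(\rho\,dV) = \frac{D\rho}{Dt}\,dV + \rho\,\frac{D}{Dt}(dV) = \left(\frac{D\rho}{Dt} + \rho\operatorname{div} v\right)dV = \left(\frac{\partial\rho}{\partial t} + \frac{\partial(\rho v^1)}{\partial x^1} + \frac{\partial(\rho v^2)}{\partial x^2} + \frac{\partial(\rho v^3)}{\partial x^3}\right)dV = 0$$

by the equation of continuity. Thus, we have 

$$M_{\mathfrak{A}}\dot{Y} = \iiint_{\mathfrak{A}} \rho\dot{y}\,dV \quad\text{and}\quad M_{\mathfrak{A}}\ddot{Y} = \iiint_{\mathfrak{A}} \rho\ddot{y}\,dV.$$

The force acting on the substance in $\mathfrak{A}$ is an extensive quantity with value 
$M_{\mathfrak{A}}\ddot{Y}$. The extensive nature can be obtained by considering the force on 
contiguous regions or just from the above formula for $M_{\mathfrak{A}}\ddot{Y}$. It is then the 
integral of an intensive function $f$ such that 

$$\iiint_{\mathfrak{A}} f\,dV = \iiint_{\mathfrak{A}} \rho\ddot{y}\,dV. \tag{3}$$

Thus, $f = \rho\ddot{y}$. We will return to this definition of $f$ as an intensive function. 
We must also consider the precise significance of $\ddot{y}$. 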
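With this definition of $f$ we again have $F_R = \iiint f\,dV$ and $M\ddot{Y} = F_R$ for 
a body or substance of finite extent. We can consider Newton’s third law as 
expressed by the equation $f\,dV = \rho\ddot{y}\,dV$ and we have $(f - F_R\rho/M)\,dV = 
\rho(\ddot{y} - \ddot{Y})\,dV$ and 

$$dV\,(y - Y)\times(f - F_R\rho/M) = \frac{d}{dt}\left[\rho(y - Y)\times(\dot{y} - \dot{Y})\,dV\right].$$

Thus, summing over the space that contains the substance yields 

$$T = \iiint (y - Y)\times f\,dV = \frac{dL}{dt} = \frac{d}{dt}\iiint \rho\left[(y - Y)\times(\dot{y} - \dot{Y})\right]dV.$$

If we sum only over a region $\mathfrak{A}$, then $Y$ is to be taken as center of gravity 
$Y_{\mathfrak{A}}$ for the substance in $\mathfrak{A}$ and the result is 

$$T_{\mathfrak{A}} = \iiint_{\mathfrak{A}} \left[(y - Y_{\mathfrak{A}})\times f\right]dV = \frac{d}{dt}\iiint_{\mathfrak{A}} \rho\left[(y - Y_{\mathfrak{A}})\times(\dot{y} - \dot{Y}_{\mathfrak{A}})\right]dV.$$

The quantity to be differentiated is $L_{\mathfrak{A}}$, the angular momentum of the 
substance in $\mathfrak{A}$. 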

We can now introduce a Cartesian frame centered at $Y$ by means of an 
orthogonal matrix $i$: 

$$y = Y + ix,$$

where $x$ is the position vector in the new frame. Let $\Omega$ be the spin vector 
determined by the relation $di/dt = i(\Omega\times)$. Then 

$$\iiint \rho\left[(y - Y)\times(\dot{y} - \dot{Y})\right]dV = \iiint \rho\left\{ix\times i(\dot{x} + \Omega\times x)\right\}dV$$
$$= i\left\{\iiint \rho(x\times\dot{x})\,dV + \iiint \rho\left[x\times(\Omega\times x)\right]dV\right\}$$
$$= i\left\{l + \iiint \rho\left[x\times(\Omega\times x)\right]dV\right\}.$$

Here $l = \iiint\rho(x\times\dot{x})\,dV$ is the angular momentum in the new coordinate 
system. The second term can be expressed as a matrix transformation on the 
vector $\Omega$, i.e., 

$$\iiint \rho\left[x\times(\Omega\times x)\right]dV = J\Omega,$$

where the matrix $J$ has elements 

$$J_{ij} = -\iiint \rho x^ix^j\,dV + \delta_{ij}\iiint \rho\,x\cdot x\,dV$$

and 

$$L = i(l + J\Omega).$$

If we knew how to calculate $f$, we have the differential equation $M\ddot{Y} = F_R$ 
for $Y$. However, in the general case there is an ambiguity relative to $\Omega$ and the 
motion of the configuration as expressed in terms of $x$ so that one does not 
have a practical way to determine $\Omega$ and $i$. 
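
The identity between the triple-product integral and the matrix $J$ can be checked on a finite set of point masses, with sums in place of integrals. The following is an illustrative sketch only: the masses, positions, and spin vector are made-up values, and the helper names are not from the text.

```python
def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inertia_matrix(masses, points):
    """J_ij = sum of m * (delta_ij x.x - x_i x_j) over the point masses."""
    J = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, points):
        xx = sum(c * c for c in x)
        for i in range(3):
            for j in range(3):
                J[i][j] += m * ((xx if i == j else 0.0) - x[i] * x[j])
    return J

def spin_term(masses, points, omega):
    """Direct sum of m * x cross (omega cross x)."""
    total = [0.0, 0.0, 0.0]
    for m, x in zip(masses, points):
        term = cross(x, cross(omega, x))
        for i in range(3):
            total[i] += m * term[i]
    return total

def mat_vec(J, v):
    return [sum(J[i][k] * v[k] for k in range(3)) for i in range(3)]

# Hypothetical data, chosen only for illustration
masses = [1.0, 2.0, 0.5, 1.5]
points = [(1.0, 0.0, 2.0), (-0.5, 1.0, 0.0), (0.0, -2.0, 1.0), (0.3, 0.4, -1.2)]
omega = (0.2, -1.0, 0.7)

J = inertia_matrix(masses, points)
direct = spin_term(masses, points, omega)
via_matrix = mat_vec(J, omega)
print(direct)
print(via_matrix)
```

The two printed vectors agree to rounding error, which reflects the vector identity $x\times(\Omega\times x) = (x\cdot x)\Omega - (x\cdot\Omega)x$ applied term by term.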

But to reap any advantage from the introduction of the orthogonal 
matrix i, we must suppose that we are dealing with a body or object in which 
the shape is essentially preserved. We have seen that in the case where the 
shape is actually preserved, that is, in the case of a rigid body, we can introduce 
an orthogonal matrix i so that if y describes the motion of a point on the 
body then $x$ is a constant. Furthermore $\rho$ is a function of $x$. We will interpret 
the statement that “the shape is essentially preserved” as corresponding to 
the fact that there exists an orthogonal matrix $i$ for which the motion of a 
point is expressed by an equation $x = x_0 + u$, where $x_0$ is a constant vector and 
$u$ is small. We suppose that there is a reference situation in which $u = 0$ and 
that $\rho_0 = \rho_0(x_0^1, x_0^2, x_0^3)$ describes the density in this reference situation. 

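The motion then is described by giving $Y$, $i$, and $u$ as functions of 
$x_0^1, x_0^2, x_0^3$. In the partition into particles, we use the reference situation. Thus 
we have 

$$\iiint \rho_0x_0\,dV_0 = 0$$

and 

$$\iiint \rho_0u\,dV_0 = 0.$$

We assume that there is a vector function $f_0$ such that Newton’s third law 
can be expressed $\rho_0\ddot{y}\,dV_0 = f_0\,dV_0$. This yields 

$$M\ddot{Y} = \iiint f_0\,dV_0 = F_R,$$

where of course the contributions to $f_0$ of inner forces can be ignored, and 
similarly we have 

$$T = \iiint (y - Y)\times f_0\,dV_0 = \frac{d}{dt}\iiint \rho_0\left[(y - Y)\times(\dot{y} - \dot{Y})\right]dV_0.$$

Now we can use $y - Y = i(x_0 + u)$, $\dot{y} - \dot{Y} = i[\dot{u} + \Omega\times(x_0 + u)]$ and obtain 

$$L = \iiint \rho_0\left[(y - Y)\times(\dot{y} - \dot{Y})\right]dV_0$$
$$= i\iiint \rho_0\left[(x_0 + u)\times(\Omega\times x_0 + \dot{u} + \Omega\times u)\right]dV_0$$
$$= i\left\{\iiint \rho_0\left[x_0\times(\Omega\times x_0)\right]dV_0 + \mathcal{U}(u, x_0)\right\} = i(J_0\Omega + \mathcal{U}),$$

where $\mathcal{U}$ collects the terms that involve $u$ and $J_0$ is calculated from the reference position, i.e., 

$$(J_0)_{ij} = -\iiint \rho_0x_0^ix_0^j\,dV_0 + \delta_{ij}\iiint \rho_0\,x_0\cdot x_0\,dV_0.$$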

We can represent $f_0$ in the $x$ frame by letting $f_0 = ig_0$. The torque relation then 
becomes 

$$\iiint (x_0 + u)\times g_0\,dV_0 = \left(\frac{d}{dt} + \Omega\times\right)(J_0\Omega + \mathcal{U}).$$

This suggests that the $x$ frame be specified by the conditions $M\ddot{Y} = F_R$ and 
$S_0 = (d/dt + \Omega\times)J_0\Omega$ and that $u$ be determined by the relation 

$$g_0 = \rho_0\left\{G_R/M + \left(\frac{d}{dt} + \Omega\times\right)\left[\dot{u} + \Omega\times(x_0 + u)\right]\right\},$$

where $F_R = iG_R$. The expression for $u$ may appear complicated, but it is just 
the form Newton’s third law takes in a moving coordinate system. In many 
cases $g_0$ will depend on $u$ and its partial derivatives relative to $x_0^1, x_0^2, x_0^3$ so 
that the $u$ equation can be regarded as a second-order partial differential 
equation on $u(x_0^1, x_0^2, x_0^3, t)$. 

It is not clear mathematically that the functions $u$ obtained will be small. 
If we multiply the above equation by $i$ and integrate relative to $dV_0$, we will 
have 

$$\frac{d^2}{dt^2}\left[i\iiint \rho_0(x_0 + u)\,dV_0\right] = 0,$$

since $\iiint f_0\,dV_0 = F_R$, $\rho_0$ does not depend on $t$, and $\iiint\rho_0x_0\,dV_0 = 0$. This yields 

$$i\iiint \rho_0u\,dV_0 = at + b.$$

We can suppose that auxiliary conditions are imposed on $u$ so that the vectors 
$a$ and $b$ are zero and thus $\iiint\rho_0u\,dV_0 = 0$. This is consistent with $u$ small but 
does not imply it. Nevertheless the critical question is to determine $f_0$. 


7.2. The Stress Tensor 

Let us now consider the general case of a substance in a region with a 
Cartesian frame from a Galilean system. We wish to specify a vector function 
$f$ of the variables $y^1, y^2, y^3$, and $t$ such that 

$$\iiint_{\mathfrak{A}} f\,dV = F_R = \iiint_{\mathfrak{A}} \rho\ddot{y}\,dV$$

is the resultant force on the substance in the region $\mathfrak{A}$ for any $\mathfrak{A}$ that contains 
the substance. 

$F_R$ is composed of two types of forces. We have, for example, forces 
that are distributed through the bulk of the material and that can be expressed 
directly as integrals, 

$$\iiint \Phi\,dV.$$

An example is gravity. For a small region near the earth this can be expressed 
in terms of a constant vector $g$ as $g\iiint\rho\,dV$. In general, however, the force 
due to gravity can be expressed as 

$$\iiint \rho g\,dV,$$

where $g$ may depend on $y^1, y^2, y^3$, and $t$. 

Another force that contributes to $F_R$ is the resultant of the forces that the 
rest of the substance exerts on the stuff in $\mathfrak{A}$ through the boundary $\mathfrak{B}$. If we 
can find a suitable integral expression for this force we will be able to apply 
Gauss’ theorem to obtain a volume integral. 

Consider an infinitesimal element $d\sigma$ of surface with unit normal 
$(n_1, n_2, n_3)$ determining an upper side for $d\sigma$. Our notion of contact indicates 
that the substance on the lower side of $d\sigma$ presses on the substance on the 
upper side with a force $F$, which is proportional to the area of $d\sigma$ and depends 
on the position $y^1, y^2, y^3$, $t$ and the components $n_1, n_2, n_3$ of the normal, i.e., 
$F = F(n, y)\,dA$. 

There is a standard argument that we will sketch that shows that $F$ is 
linear in the $n_1, n_2, n_3$, i.e., that 

$$F = dA\,i_\alpha\sigma_{\alpha\beta}n_\beta$$

and $\sigma_{ij} = \sigma_{ji}$. In general the $\sigma_{ij}$ will be functions of $y^1, y^2, y^3$, and $t$. 

Consider a point $P_0$, $y_0^1, y_0^2, y_0^3$, and let a normal direction $(n_1, n_2, n_3)$ be 
chosen. For convenience we will assume $n_i > 0$. Let $p$ be a small positive 
number and consider the plane $l$ with equation 

$$n_1x^1 + n_2x^2 + n_3x^3 = p$$

in a coordinate system with origin at $P_0$ and axes parallel to the original. 
Let $P_i$, $i = 1, 2, 3$, denote the intercept of $l$ with the $x^i$ axis. The intercept values 
are $x^i = p/n_i$. The volume of the tetrahedron $P_0P_1P_2P_3$ is $p^3/6n_1n_2n_3$ and the 
area, $A$, of the triangle $P_1P_2P_3$ in the plane $l$ is $p^2/2n_1n_2n_3$. The triangle 
$P_0P_2P_3$ in the $x^2, x^3$ plane has area $A_1 = p^2/2n_2n_3$. Similarly we can define 
$A_2$ and $A_3$ as the areas of the triangles $P_0P_3P_1$ and $P_0P_1P_2$, respectively. 
Thus, $A_i = An_i$. 
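
The geometric relations for the tetrahedron are easy to confirm numerically. The following sketch, with an arbitrarily chosen unit normal and value of $p$, and with a function name introduced only for this illustration, computes the oblique face area directly from the intercepts and compares it with the formulas quoted above.

```python
import math

def tetrahedron_faces(n, p):
    """Volume and face areas of the tetrahedron cut off by n.x = p.

    n is a unit normal with positive components; p is the distance of the
    plane from the origin P0.
    """
    n1, n2, n3 = n
    a, b, c = p / n1, p / n2, p / n3       # intercepts P1, P2, P3 on the axes
    volume = a * b * c / 6
    # Oblique face P1 P2 P3: half the length of (P2 - P1) x (P3 - P1)
    u = (-a, b, 0.0)
    w = (-a, 0.0, c)
    cr = (u[1] * w[2] - u[2] * w[1],
          u[2] * w[0] - u[0] * w[2],
          u[0] * w[1] - u[1] * w[0])
    area = 0.5 * math.sqrt(sum(t * t for t in cr))
    coordinate_areas = (b * c / 2, c * a / 2, a * b / 2)   # A_1, A_2, A_3
    return volume, area, coordinate_areas

raw = (1.0, 2.0, 2.0)
length = math.sqrt(sum(t * t for t in raw))
n = tuple(t / length for t in raw)         # unit normal (1/3, 2/3, 2/3)
p = 0.01

V, A, areas = tetrahedron_faces(n, p)
print(A, p ** 2 / (2 * n[0] * n[1] * n[2]))   # area of the oblique face
print(areas, tuple(A * ni for ni in n))       # A_i compared with A * n_i
print(V, p * A / 3)                           # volume compared with p A / 3
```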
Consider now the forces on the various faces of the tetrahedron $P_0P_1P_2P_3$ 
by the exterior substance. That on $P_1P_2P_3$ is $-F(n)A$. Now $i_j$ is the unit 
normal to the face with area $A_j$ and is directed into the tetrahedron. The 
corresponding force on this face is $F(i_j)A_j = F(i_j)n_jA$. Thus, the resultant of the 
forces on the faces of the tetrahedron is $[F(i_\beta)n_\beta - F(n)]A$. 

Now let $\phi V$ be the resultant of the body forces on the tetrahedron, 
where $V = pA/3$ is the volume. Let $X$ denote the position vector for the center 
of gravity of the tetrahedron. Then 

$$[F(i_\beta)n_\beta - F(n)]A = V\rho\ddot{X} - \phi V = pA(\rho\ddot{X} - \phi)/3.$$

If we cancel $A$ and let $p$ approach zero, we obtain $F(n) = F(i_\beta)n_\beta = i_\alpha\sigma_{\alpha\beta}n_\beta$ 
when $\sigma_{\alpha j}$ is defined by $F(i_j) = i_\alpha\sigma_{\alpha j}$. 

Thus, $F(n)$ is related to the normal vector $n$ by the transformation 
$F(n) = \sigma n$, where $\sigma$ is the matrix $(\sigma_{ij})$. The symmetry of the matrix $\sigma$ is established 
by considering the torque equation for a small cube of side $a$ of the 
substance centered at $P_0$. The forces on opposing faces of the cube constitute 
couples that contribute a total torque of 

$$-a^3\,i_\alpha\times F(i_\alpha) = a^3(\sigma_{23} - \sigma_{32},\ \sigma_{31} - \sigma_{13},\ \sigma_{12} - \sigma_{21}).$$

The angular momentum of the cube corresponding to a spin vector $\Omega$ is 

$$L_C = \int_{-a/2}^{a/2}\int_{-a/2}^{a/2}\int_{-a/2}^{a/2} \rho(P_0)\left[x\times(\Omega\times x)\right]dx^1\,dx^2\,dx^3 = (\rho_0a^5/6)\,\Omega.$$

We suppose that the body forces are continuously differentiable; i.e., 
$\phi(x^1, x^2, x^3) = \phi(P_0) + (\partial\phi/\partial x^\alpha)x^\alpha$. The torque $T^*$ due to the body forces is then 

$$T^* = \int_{-a/2}^{a/2}\int_{-a/2}^{a/2}\int_{-a/2}^{a/2} \rho(P_0)\left[x\times\phi\right]dx^1\,dx^2\,dx^3 = (\rho_0a^5/12)\operatorname{curl}\phi.$$

An argument similar to the previous one now shows that $\sigma_{ij} = \sigma_{ji}$. 
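
The limiting argument referred to here can be written out as follows. This is a sketch in the notation of the text; it assumes that $\Omega$ and $\dot{\Omega}$ remain bounded as the cube shrinks and that corrections to the surface couple from the variation of the stresses across the cube are of higher order in $a$.

```latex
% Torque balance for the cube of side a centered at P_0:
%   (surface couple) + (body-force torque) = d/dt (angular momentum)
\[
  a^3\,(\sigma_{23}-\sigma_{32},\ \sigma_{31}-\sigma_{13},\ \sigma_{12}-\sigma_{21})
  \;+\; T^{*} \;=\; \frac{dL_C}{dt}.
\]
% Both T^* and dL_C/dt are of order a^5, while the surface couple is of order a^3.
% Dividing by a^3 and letting a -> 0 leaves
\[
  \sigma_{23}=\sigma_{32},\qquad \sigma_{31}=\sigma_{13},\qquad \sigma_{12}=\sigma_{21}.
\]
```

The structure parallels the tetrahedron argument: there the surface terms were of order $p^2$ against volume terms of order $p^3$; here the surface couple is of order $a^3$ against terms of order $a^5$.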

The symmetry of the matrix $\sigma$ is of considerable significance. Suppose 
at the point $P_0$ we replace the $x^1, x^2, x^3$ coordinate system with another 
Cartesian coordinate system $x'^1, x'^2, x'^3$ with unit vectors $i'_1, i'_2, i'_3$ along the 
axes instead of $i_1, i_2, i_3$. This determines an orthogonal matrix $j = (j_{r\beta})$ by the 
relation $i'_r = j_{r\beta}i_\beta$ or equivalently $i_\beta = j_{r\beta}i'_r$. We consider the coordinates 
$n^1, n^2, n^3$ of $n$ in the $x$ system as constituting a one-column matrix, which we 
also denote by $n$ and similarly for $n'$. The relation $i'_rn'^r = n = i_\beta n^\beta = j_{r\beta}i'_rn^\beta$ 
yields $n'^r = j_{r\beta}n^\beta$ or $n' = jn$ for the one-column matrices. Similarly we have for 
$F(n)$ and $F(n')'$ as one-column matrices $F(n')' = jF(n)$. This also indicates the 
transformation of the matrix $\sigma$, since $\sigma'n' = F(n')' = jF(n) = j\sigma n = j\sigma j^{-1}n'$ or 
$\sigma' = j\sigma j^{-1} = j\sigma j^{t}$, where $j^{t}$ is the transpose of $j$. 
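
The transformation rule for the stress matrix can be illustrated with a concrete rotation. In the sketch below the rotation angle, the stress values, and the normal are invented for the example, and the helper functions are not from the text; the rows of `j` are the new unit vectors expressed in the old frame, as in the relation $i'_r = j_{r\beta}i_\beta$.

```python
import math

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def mat_vec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

theta = 0.4
c, s = math.cos(theta), math.sin(theta)
j = [[c, s, 0.0],
     [-s, c, 0.0],
     [0.0, 0.0, 1.0]]               # rotation of the frame about the third axis

sigma = [[2.0, 0.5, -1.0],
         [0.5, 1.0, 0.3],
         [-1.0, 0.3, -0.5]]         # a symmetric stress matrix (made-up values)
n = [1 / 3, 2 / 3, 2 / 3]           # unit normal in the old frame

sigma_new = mat_mul(mat_mul(j, sigma), transpose(j))   # sigma' = j sigma j^t
n_new = mat_vec(j, n)                                  # n' = j n
F_old = mat_vec(sigma, n)                              # F(n) = sigma n
F_new = mat_vec(sigma_new, n_new)                      # sigma' n'
F_rotated = mat_vec(j, F_old)                          # j F(n)

print(F_new)
print(F_rotated)
```

The two printed columns coincide, and `sigma_new` is again symmetric, so the symmetry of the stress matrix does not depend on the Cartesian frame chosen.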
Since $\sigma$ is symmetric, an orthogonal matrix $j$ can be chosen so that $\sigma'$ is 
diagonal. Now if $n$ is a given direction, $F(n)$ perpendicular to the surface 
element $d\sigma$ is equivalent to $F(n) = \lambda n = \sigma n$, or $n$ is a characteristic vector of the 
matrix $\sigma$. Since $\sigma$ is symmetric there always are three such directions at each 
point $y^1, y^2, y^3$ that are mutually orthogonal, and these can be taken as 
$i'_1$, $i'_2$, and $i'_3$ to determine the matrix $j$. If the characteristic values of $\sigma$, which 
we denote by $\lambda_1, \lambda_2, \lambda_3$, are distinct, these three directions are determined up 
to a constant at each point $y^1, y^2, y^3$. If the $\lambda_i$ remain distinct throughout a 
region $\mathfrak{A}$ in the substance, we can consider them as varying continuously and 
the vectors $i'_1$, $i'_2$, and $i'_3$ as well. 

The matrix $\sigma$ is a “tensor” in the original sense of the word and gives 
the stress $F(n)$ throughout the substance. It is clear that we can associate $\sigma$ 
with a field of three mutually perpendicular vectors $\lambda_1i'_1, \lambda_2i'_2, \lambda_3i'_3$ distributed 
through the substance such that $F(i'_r) = \lambda_ri'_r$, where $\lambda_r$ is the force per unit area 
on an infinitesimal surface element $d\sigma$ perpendicular to the direction $i'_r$. Conversely, 
given such a field, the tensor $\sigma$ is determined. 

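The expression for the force on an infinitesimal element of surface permits 
us to express the resultant of the contact forces on the boundary $\mathfrak{B}$ 
of a region $\mathfrak{A}$ at a time $t$: 

$$F_C = -\iint_{\mathfrak{B}} F(n)\,dA$$
$$= -\iint_{\mathfrak{B}} i_\alpha(\sigma_{\alpha 1}\,dy^2\,dy^3 + \sigma_{\alpha 2}\,dy^3\,dy^1 + \sigma_{\alpha 3}\,dy^1\,dy^2)$$
$$= -\iiint_{\mathfrak{A}} i_\alpha\left(\frac{\partial\sigma_{\alpha 1}}{\partial y^1} + \frac{\partial\sigma_{\alpha 2}}{\partial y^2} + \frac{\partial\sigma_{\alpha 3}}{\partial y^3}\right)dy^1\,dy^2\,dy^3.$$

Here the $\sigma_{ij}$ are to be considered as functions of $y^1, y^2, y^3$, and $t$, that is, of the 
situation as it appears in the medium. Thus, the force acting on the substance 
in a region $\mathfrak{A}$ is given by an integral 

$$F_R = \iiint_{\mathfrak{A}} \left(\Phi - i_\alpha\frac{\partial\sigma_{\alpha\beta}}{\partial y^\beta}\right)dV,$$

where $\Phi$ is a vector expression for the body forces. 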

The Eulerian description describes the activity of a substance in terms 
of spatial position and time. The movement of a body in particular would be 
expressed by its density $\rho = \rho(y^1, y^2, y^3, t)$ and three velocity components 
$v^i = v^i(y, t)$. For this description the acceleration $\ddot{y}$ is the intrinsic derivative 
of the velocity vector 

$$\ddot{y} = \frac{\partial v}{\partial t} + \frac{\partial v}{\partial y^\alpha}v^\alpha$$

and Newton’s law $f\,dV = \rho\ddot{y}\,dV$ becomes 

$$\Phi(y^1, y^2, y^3, t) - i_\alpha\frac{\partial\sigma_{\alpha\beta}}{\partial y^\beta} = \rho\left(\frac{\partial v}{\partial t} + \frac{\partial v}{\partial y^\alpha}v^\alpha\right).$$

With the equation of continuity we now have four equations on the four 
quantities $\rho$, $v^1$, $v^2$, $v^3$, provided we know the $\sigma_{ij}$ values. 

In the alternate description, points in the substance are identified by 
their position, say, $x_0^1, x_0^2, x_0^3$ in a certain reference situation, and the current 
position is given by $y = y(x_0, t)$. We have used this in describing bodies whose 
shape is approximately preserved. For a fixed value of $t$, this yields a spatial 
transformation $x_0 \to y$. Let 

$$J = \frac{\partial(y^1, y^2, y^3)}{\partial(x_0^1, x_0^2, x_0^3)}.$$

Under this transformation we can regard a $y$ region $\mathfrak{A}$ and its boundary $\mathfrak{B}$ 
as the images of corresponding $\mathfrak{A}_0$ and $\mathfrak{B}_0$ in $x_0$ space. From the formula for 
changing variables in an integral we have $\rho(y, t)J = \rho_0(x_0)$. Similarly the 
resultant of the body forces is given by 

$$\iiint_{\mathfrak{A}} \Phi(y)\,dV = \iiint_{\mathfrak{A}_0} \Phi(y(x_0, t))\,J\,dV_0.$$

If the expressions $\partial\sigma_{\alpha\beta}/\partial y^\beta$ are available as functions of $y$ or even their partial 
derivatives, we can make a change to a $dV_0$ integral. Alternately, we can 
transform the integral 

$$\iint_{\mathfrak{B}} i_\alpha(\sigma_{\alpha 1}\,dy^2\,dy^3 + \sigma_{\alpha 2}\,dy^3\,dy^1 + \sigma_{\alpha 3}\,dy^1\,dy^2)$$

by using equations in the form 

$$dy^2\,dy^3 = \eta^1_1\,dx_0^2\,dx_0^3 + \eta^1_2\,dx_0^3\,dx_0^1 + \eta^1_3\,dx_0^1\,dx_0^2,$$

where 

$$\eta^1_1 = \frac{\partial(y^2, y^3)}{\partial(x_0^2, x_0^3)}, \quad\text{etc.}$$

The resultant of the boundary contact force is then 

$$-\iint_{\mathfrak{B}_0} i_\alpha(\sigma_{\alpha\beta}\eta^\beta_1\,dx_0^2\,dx_0^3 + \sigma_{\alpha\beta}\eta^\beta_2\,dx_0^3\,dx_0^1 + \sigma_{\alpha\beta}\eta^\beta_3\,dx_0^1\,dx_0^2).$$

We can write $\sigma^0_{ij} = \sigma_{i\beta}\eta^\beta_j$ and consider it as depending on $y$ and the partials of $y$ 
relative to the space variable. Of course, we must look into this further. The 
boundary force now can be expressed 

$$-\iiint_{\mathfrak{A}_0} i_\alpha\frac{\partial\sigma^0_{\alpha\beta}}{\partial x_0^\beta}\,dV_0.$$

Newton’s equations now become 

$$\rho_0\frac{\partial^2y}{\partial t^2} = J\Phi - i_\alpha\frac{\partial\sigma^0_{\alpha\beta}}{\partial x_0^\beta}.$$

These are three equations on the three functions $y^1(x_0, t)$, $y^2(x_0, t)$, $y^3(x_0, t)$, 
and to use them we must find the $\sigma^0_{ij}$. If we wish to consider motion in the form 
$y = Y + jx$ or $y = Y + j(x_0 + u)$, these, of course, become equations on $x = x(x_0, t)$ 
or $u = u(x_0, t)$. 


7.3. Deformation and Stress 

We must now discuss ways in which the tensor components $\sigma_{ij}$ can be 
determined. Stresses within a body are usually associated with a “change of 
shape.” Stresses appear because the substance is distorted or strained. The 
notion of change of shape requires that we be able to identify points in the 
substance and determine their positional history. Also, there is a reference 
shape or position for the body that corresponds to an “undistorted” shape 
and that can be used to specify points in the body. Distortion is purely a 
spatial relation, i.e., if we take a snapshot of the body at any instant in time, we 
can determine the distortion by comparing purely spatial aspects with the 
reference. 

It is convenient to assume that in the reference shape the center of gravity 
is at the origin. We choose a Cartesian coordinate system for this reference 
position and specify a point $P_0$ by coordinates $x_0^1, x_0^2, x_0^3$. Now, if we move the 
substance congruently so that $P_0$ goes into a point $Q_0$ with coordinates 
$y_0 = Y + jx_0$, with orthogonal matrix $j$, then the new position is also an 
undistorted reference shape. A distorted position can be considered as 
corresponding to a mapping $x_0 \to x(x_0)$ if $P_0$ goes into the point $Q$ with 
coordinates $y = Y + jx$. The distortion then does not depend on the vector 
$Y$, and the distortion is the same for all mappings $x_0 \to \bar{x}(x_0)$ for which $\bar{x}(x_0) = 
j'x(j''x_0)$ where $j'$ and $j''$ are orthogonal matrices. In the actual motion $Y$ can 
be taken as $Y(t)$ and we may be able to determine $j = j(t)$ so that $x(x_0, t) = 
x_0 + u(x_0, t)$ for $u$ small. But distortion is not a notion associated with time 
but is a property of a class of spatial transformations. A transformation 
$x_0 \to x(x_0)$ determines a class of transformations $x_0 \to Y + j'x(j''x_0)$, but any 
transformation in this class also determines the same class in this way. 
Consider then the mapping $x_0 \to x(x_0)$, which is given by three functions, 
$x^i = x^i(x_0^1, x_0^2, x_0^3)$. We will suppose that these functions are continuously 
differentiable up to the second order, that the Jacobian 

$$\frac{\partial(x^1, x^2, x^3)}{\partial(x_0^1, x_0^2, x_0^3)}$$

is positive, and that it is bounded away from zero. Thus, the amount of volume 
compression at a point is restricted. 

A point $P$ in the neighborhood of $P_0$ will have coordinates $x_0^i + l^i\,dr$ 
with $dr$ small and $(l^1)^2 + (l^2)^2 + (l^3)^2 = 1$. Under the transformation $x_0 \to x(x_0)$, 
$P$ will go into a point $Q$ with coordinates $x^i + J^i_\beta l^\beta\,dr$ with $J^i_j = \partial x^i/\partial x_0^j$. For $dr$ 
fixed, we have, then, a transformation $J$ such that the unit vector $l$ is taken into 
$Jl = (J^1_\beta l^\beta, J^2_\beta l^\beta, J^3_\beta l^\beta)$. Correspondingly we have a quadratic form $\alpha_{\alpha\beta}l^\alpha l^\beta = Jl\cdot 
Jl$ for the transformed length squared of $l$ with positive definite matrix $\alpha$. We 
can find a positive definite matrix $A$ such that $A^2 = \alpha$. $A$ has the same characteristic 
vectors as $\alpha$, and the characteristic values of $A$ are the square roots of 
the corresponding values of $\alpha$. Thus $Al\cdot Al = A^2l\cdot l = \alpha_{\alpha\beta}l^\alpha l^\beta = Jl\cdot Jl = J'Jl\cdot l$, 
where $J'$ denotes the transpose of $J$. 
We also have $A^2 = \alpha = J'J$, and for any vector $l$, $\|Al\| = \|Jl\|$; i.e., the length of 
$Al$ equals that of $Jl$. 

The matrix $A$ is nonsingular, since $Al = 0$ implies $Jl = 0$ and $l = 0$. The 
matrix $j = JA^{-1}$ has the property that for any vector $l$, 

$$\|jl\| = \|JA^{-1}l\| = \|AA^{-1}l\| = \|l\|.$$

This yields 

$$jl\cdot jl' = \left[\|j(l + l')\|^2 - \|j(l - l')\|^2\right]/4 = \left(\|l + l'\|^2 - \|l - l'\|^2\right)/4 = l\cdot l',$$

and hence $j$ is orthogonal. We have $jA = J$. This is the “canonical resolution” 
for $J$ with $A$ positive definite and $j$ orthogonal. We have an alternate resolution 
for $J$ in the form $J = Bj$ with $B = jAj^{-1}$ also positive definite. Corresponding 
to $J'J = A^2$, we have, since $J' = j'B = j^{-1}B$, $JJ' = B^2$. 
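
The canonical resolution $J = jA = Bj$ can be computed numerically. The sketch below does not follow the construction in the text, which takes the square root of $J'J$; instead it uses a standard Newton iteration that averages a matrix with its inverse transpose and converges to the orthogonal factor when $J$ is nonsingular. The sample matrix is invented and all function names are illustrative.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def inverse(a):
    """Inverse of a 3x3 matrix from its adjugate."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = a
    adj = [[a22 * a33 - a23 * a32, a13 * a32 - a12 * a33, a12 * a23 - a13 * a22],
           [a23 * a31 - a21 * a33, a11 * a33 - a13 * a31, a13 * a21 - a11 * a23],
           [a21 * a32 - a22 * a31, a12 * a31 - a11 * a32, a11 * a22 - a12 * a21]]
    det = a11 * adj[0][0] + a12 * adj[1][0] + a13 * adj[2][0]
    return [[adj[i][j] / det for j in range(3)] for i in range(3)]

def canonical_resolution(J, iterations=30):
    """Return (j, A, B) with J = j A = B j, j orthogonal, A and B symmetric."""
    j = [row[:] for row in J]
    for _ in range(iterations):
        inv_t = transpose(inverse(j))
        # Newton step: average the current matrix with its inverse transpose
        j = [[0.5 * (j[r][c] + inv_t[r][c]) for c in range(3)] for r in range(3)]
    A = mat_mul(transpose(j), J)   # A = j' J
    B = mat_mul(J, transpose(j))   # B = J j'
    return j, A, B

# Hypothetical deformation gradient close to the identity
J = [[1.10, 0.20, 0.00],
     [0.05, 0.90, 0.10],
     [0.00, -0.10, 1.05]]

j, A, B = canonical_resolution(J)
JtJ = mat_mul(transpose(J), J)
A2 = mat_mul(A, A)
print(A)
print(A2)
print(JtJ)
```

Up to rounding, `A` is symmetric, `A2` agrees with `JtJ`, and `mat_mul(j, A)` and `mat_mul(B, j)` both reproduce `J`.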

Thus, the transformation $x_0 \to x(x_0)$ determines a field of transformations, 
$J = (\partial x^i/\partial x_0^j)$, as well as two other fields of matrices $A$ and $j$. The matrix 
$A$ can be interpreted as follows. Take a small cube of the substance with 
vertex at $P_0$ and sides parallel to the three characteristic vectors of $A$. Let 
$\mu_1, \mu_2, \mu_3$ be the corresponding characteristic values. Compress each side 
of the cube in the ratio of the corresponding $\mu_i$. This should be done so that 
any smaller cube inside the given cube and with the same side directions is 
similarly compressed. The action of $J$, then, consists of such a compression 
followed by a rotation $j$. If we use the alternate resolution $J = Bj$, we first 
rotate the chosen cube by $j$ and then compress as indicated in the given 
ratios, which are also the characteristic values of $B$. If we use any other 
transformation of the same class, $J$ is replaced by $j'Jj'' = j'jAj'' = (j'jj'')\,j''^{-1}Aj''$, 
where $j'$ and $j''$ are the constant orthogonal matrices that define the class. 
Thus, the new field has $A' = j''^{-1}Aj''$. This is just a rotation of $A$ by 
$j''$, which is the same at all points. 

This indicates that either the compression due to A or that due to B 
corresponds to the distortion of the substance. In fact it can be shown that the 
field A determines the field j up to a constant multiplicative j 0 . 

If we consider the $B$ compression of the rotated cube as associated with 
forces perpendicular to the faces of the cube, the area average of these forces 
yields a system of stresses of the type we have associated with the matrix $\sigma$ in 
the previous section. For now we have three perpendicular directions, and 
for each of these directions the stress on the plane surface to which it is 
normal is in the same direction. Each such direction is therefore a characteristic 
vector of $\sigma$. We must still relate the face stresses, that is, the characteristic 
values $\lambda_1, \lambda_2, \lambda_3$ of $\sigma$, with the compression ratios $\mu_1, \mu_2, \mu_3$, which are the 
characteristic values of $B$. 

We can now consider how this relationship can be determined experimentally 
with no reference to infinitesimals. For suppose a cube of side $a$ 
of the undeformed material is confined within a very strong box of sides 
$\mu_1a, \mu_2a, \mu_3a$ in such a way that the compression is similar throughout the 
substance. The stress on each face should be uniform and the stresses on 
opposing faces must be negatives of each other if the box does not move. We 
assume by symmetry that these stresses on opposing faces produce no torque, 
and hence these stresses are perpendicular to the faces. These arguments 
apply to any smaller cube within the compressed substance by the assumption 
of similar compression. Now if $\lambda_1, \lambda_2, \lambda_3$ are the values of the stresses on 
opposing faces, then one has a uniform situation throughout the substance 
in which the compression of the matrix $B$ is associated with the tensor $\sigma$ 
with the same characteristic directions and characteristic values $\lambda_1, \lambda_2, \lambda_3$. If 
we specify the $\mu_i$ and determine the $\lambda_i$ by measurements, we obtain $\lambda_i$ as a 
function of $\mu_1, \mu_2, \mu_3$, or we could proceed in the other direction. Referring 
the infinitesimal situation to a finite one is, of course, the essential idea of 
calculus, that is, “The Method of Fluxions.” 

Unfortunately there is a difficulty we must now face. In general 
when the $\lambda_i$ are measured, they will be found to depend on the way the desired 
compression was obtained, that is, on the time history. If we measure very 
quickly the substance may get warm and yield, say, higher values than if we 
do it more slowly. If we compress it to higher values and then permit it to 
recede to the desired compression, the values of $\lambda_1$, $\lambda_2$, and $\lambda_3$ are often much 
lower than the ones obtained from a monotonic compression. Thus, the time 
history of the compression must also be specified for the experiment, and the 
history of the compression of the infinitesimal cube in the analysis must also 
be specified. 

There are, however, certain practical cases in which simpler relations
between the $\lambda_i$ and $\mu_i$ can be assumed, at least as approximations. Let us
consider the significance of the assumption that the $\lambda_i$ are functions of the
$\mu_i$. Consider, then, a specific procedure for compressing the given cube of
material from, say, $\mu_i = 1$ to values $\mu_1^0, \mu_2^0, \mu_3^0$. This means that each $\mu_i$ is given
as a function of time, and we will suppose the procedure occurs in the time
interval $0 \le t \le 1$. The work done in compressing the cube of material is then
given by

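$$W = a^3 \int_0^1 \left( -\lambda_1 \mu_2 \mu_3 \frac{d\mu_1}{dt} - \lambda_2 \mu_1 \mu_3 \frac{d\mu_2}{dt} - \lambda_3 \mu_1 \mu_2 \frac{d\mu_3}{dt} \right) dt,$$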

where the $\lambda_i$ are functions of the $\mu_i$ and hence of the time. But if we reverse
this procedure the cube will now do precisely the same amount of work in
decompressing, since the $\lambda_i$ are determined by the $\mu_i$. It follows, then, that the
work $W$ must be present in the form of potential elastic energy in the compressed
cube. Thus, the assumption that the $\lambda_i$ are functions of the $\mu_i$ is equivalent
to the existence of an elastic potential energy function $u_e$ of $\mu_1, \mu_2, \mu_3$ such
that a cube of side $a$ has, when compressed, the energy $a^3 u_e$. $u_e$ is such that

$$du_e = -\lambda_1 \mu_2 \mu_3 \, d\mu_1 - \lambda_2 \mu_1 \mu_3 \, d\mu_2 - \lambda_3 \mu_1 \mu_2 \, d\mu_3,$$

and we have

$$\lambda_i = -\frac{\mu_i \, \partial u_e / \partial \mu_i}{\mu_1 \mu_2 \mu_3}.$$
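
The claim that the work depends only on the end state once an elastic potential exists can be checked numerically. The following is a minimal sketch in Python under stated assumptions: the quadratic energy density `u_e`, the constant `k`, the target ratios `TARGET`, and the two compression histories are hypothetical choices made for illustration and are not taken from the text. The face stresses are computed from the relation between the stresses and the partials of $u_e$, and the work per unit reference volume $W/a^3$ is accumulated along each history.

```python
def u_e(mu, k=1.0):
    """Hypothetical elastic energy density (an assumption for illustration)."""
    return 0.5 * k * sum((m - 1.0) ** 2 for m in mu)


def face_stresses(mu, k=1.0):
    """lambda_i = -(partial of u_e w.r.t. mu_i) / (product of the other two mu's)."""
    m1, m2, m3 = mu
    d1, d2, d3 = (k * (m - 1.0) for m in mu)  # partial derivatives of u_e
    return (-d1 / (m2 * m3), -d2 / (m1 * m3), -d3 / (m1 * m2))


def work_per_unit_volume(history, steps=2000, k=1.0):
    """Midpoint-rule value of W / a^3 along the history mu(t), 0 <= t <= 1."""
    total = 0.0
    for n in range(steps):
        a = history(n / steps)
        b = history((n + 1) / steps)
        m1, m2, m3 = (0.5 * (p + q) for p, q in zip(a, b))
        l1, l2, l3 = face_stresses((m1, m2, m3), k)
        total += (-l1 * m2 * m3 * (b[0] - a[0])
                  - l2 * m1 * m3 * (b[1] - a[1])
                  - l3 * m1 * m2 * (b[2] - a[2]))
    return total


TARGET = (0.9, 0.95, 0.8)  # illustrative final compression ratios


def straight(t):
    """All three ratios change in proportion."""
    return tuple(1.0 + t * (m - 1.0) for m in TARGET)


def curved(t):
    """Same end points as `straight`, but a different time history."""
    return tuple(1.0 + (t ** (i + 1)) * (m - 1.0) for i, m in enumerate(TARGET))


w_straight = work_per_unit_volume(straight)
w_curved = work_per_unit_volume(curved)
```

Both histories should give the same work, equal to `u_e(TARGET)`, which is the numerical counterpart of the statement that $W$ is stored as potential energy. A stress law that depended on the history would not pass this check.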

For a substance for which such a $u_e$ exists, the total potential energy is
obtained by a summation in which $a^3$ is replaced by $dV_0$, the uncompressed
reference volume. The extensive elastic potential energy is given by

$$U_e = \iiint u_e(\mu_1, \mu_2, \mu_3) \, dV_0.$$

There are circumstances in which the assumption of the existence of an
elastic potential energy is reasonable. These include the possibility of small
rapid oscillations in which heat flow can be either neglected or compensated
for by dissipation terms. One also notices that $u_e$ is ultimately dependent on
certain functions of the partial derivatives, $\partial x^i/\partial x_0^j$, since the $\mu_i$ were obtained
as follows. One forms the characteristic equation of the matrix $JJ' = B^2$,
which has three coefficients, which are polynomials in the partials $\partial x^i/\partial x_0^j$.
One solves for the roots of this equation and extracts the square roots.

Let us consider, then, the motion of a substance for which an elastic
potential energy function exists. We suppose that its point $P_0$, which is at
$x_0$ in the reference position, moves in accordance with

$$y = Y(t) + j(t)\,x(x_0, t).$$

For simplicity we will assume that $Y$, $j$ and its spin vector $\Omega$ are given functions
of $t$. The elastic potential energy is given by an integral

$$U_e = \iiint u_e \, dV_0.$$

We consider $u_e$ to be a function of the $\partial x^i/\partial x_0^j$ and let $u_{e,ij}$ denote the partial of
$u_e$ relative to $\partial x^i/\partial x_0^j$. We suppose that the potential energy of the body
forces is given also by a spatial integral

$$V = \iiint \phi(x^1, x^2, x^3)\,\frac{\partial(x^1, x^2, x^3)}{\partial(x_0^1, x_0^2, x_0^3)} \, dV_0.$$


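If $Y(t)$ is the center of gravity, the kinetic energy can be written as

$$\begin{aligned}
T = {} & \tfrac12 \iiint \rho_0 \left( \frac{\partial x^1}{\partial t} + \Omega^2 x^3 - \Omega^3 x^2 \right)^2 dV_0 \\
 & + \tfrac12 \iiint \rho_0 \left( \frac{\partial x^2}{\partial t} + \Omega^3 x^1 - \Omega^1 x^3 \right)^2 dV_0 \\
 & + \tfrac12 \iiint \rho_0 \left( \frac{\partial x^3}{\partial t} + \Omega^1 x^2 - \Omega^2 x^1 \right)^2 dV_0.
\end{aligned}$$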


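The Lagrangian, then, is $T - U_e - V$ and one must have a stationary value for
$\int_{t_1}^{t_2} (T - U_e - V)\,dt$ relative to variations in the functions $x^i(x_0, t)$. The usual
variation procedures then yield three partial differential equations, the
first of which is the following

$$\cdots + \rho_0 \left[ \Omega^3 \left( \frac{\partial x^2}{\partial t} + \Omega^3 x^1 - \Omega^1 x^3 \right) - \Omega^2 \left( \frac{\partial x^3}{\partial t} + \Omega^1 x^2 - \Omega^2 x^1 \right) \right] - \frac{\partial \phi}{\partial x^1} K = 0$$

for

$$K = \frac{\partial(x^1, x^2, x^3)}{\partial(x_0^1, x_0^2, x_0^3)}, \qquad K^1_1 = \frac{\partial(x^2, x^3)}{\partial(x_0^2, x_0^3)}, \qquad K^1_2 = \frac{\partial(x^2, x^3)}{\partial(x_0^3, x_0^1)},$$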

etc. There is one case in which this expression simplifies. If $\phi(x^1, x^2, x^3)$ can be
expressed as $\psi(x^1, x^2, x^3)\rho$, where $\psi$ depends only on $x^1, x^2, x^3$, i.e., the current
position, and $\rho$ is the density, then changing variables yields $\psi \rho K$ as the
integrand relative to $dV_0$. But $\rho K = \rho_0(x_0)$, so that $V = \iiint \psi \rho_0 \, dV_0$, and in the
above one can omit the $K^1_j$ terms and replace $(\partial\phi/\partial x^1)K$ by $\rho_0(\partial\psi/\partial x^1)$.
Presumably the $u_{e,ij}$ are functions of the first partials of the $x^i$ relative to
the $x_0^j$ so that the first line contains second derivatives relative to both time
and the spatial variables. The initial situation for the motion has to be
specified.

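Clearly the general case is complex. One simplification is associated
with assuming that $x$ can be expressed as $x_0 + u$ with $u$ small. The objective
is to obtain a simple approximate expression for $B$. One has

$$B \approx I + \frac12 \left[ \frac{\partial u}{\partial x_0} + \left( \frac{\partial u}{\partial x_0} \right)' \right].$$

The components of the second matrix are usually written

$$e_{ij} = \frac12 \left( \frac{\partial u^i}{\partial x_0^j} + \frac{\partial u^j}{\partial x_0^i} \right).$$

If we consider $u_e$ to be a function of $\mu_i' = \mu_i - 1$, then we can consider $u_e$ to
depend on the three functions $p_1 = e_{11} + e_{22} + e_{33}$, $p_2 = e_{11}e_{22} + e_{22}e_{33}
+ e_{33}e_{11} - e_{12}^2 - e_{23}^2 - e_{31}^2$, and $p_3 = \det(e_{ij})$, since the $\mu_i'$ are roots of $x^3 - p_1 x^2
+ p_2 x - p_3 = 0$. Since $p_1^2 - 2p_2$, which equals the sum of the squares of the $\mu_i'$,
is never negative, $u_e$ is frequently taken in the form $A p_1^2 + B(p_1^2 - 2p_2)$.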

A simplification of practical interest is associated with the experiment
used to establish Young’s modulus of elasticity. If we take the cube and subject
the $x^2, x^3$ face to a tension $T$, then $\mu_1 = 1 + T/E$, $\mu_2 = 1 - \sigma T/E$, $\mu_3 = 1 - \sigma T/E$,
where $T$ is force per unit area and $E$ is the modulus of elasticity. If $\delta\mu_i = \mu_i - 1$,
these equations can be written $E\,\delta\mu_1 = T$, $E\,\delta\mu_2 = -\sigma T$, $E\,\delta\mu_3 = -\sigma T$. If we
apply tensions to the other four faces and suppose the effect is additive, we
have

$$\begin{aligned}
E\,\delta\mu_1 &= T_1 - \sigma T_2 - \sigma T_3 \\
E\,\delta\mu_2 &= -\sigma T_1 + T_2 - \sigma T_3 \\
E\,\delta\mu_3 &= -\sigma T_1 - \sigma T_2 + T_3.
\end{aligned}$$

The equations can be inverted to yield

$$(1 - 2\sigma)(\sigma + 1)\,T_1 = (1 - \sigma)E\,\delta\mu_1 + \sigma E\,\delta\mu_2 + \sigma E\,\delta\mu_3, \quad \text{etc.}$$

For metals there is a range of $T/E$ for which these relations are of practical
significance. Usually this range is limited by, say, $T/E < 10^{-3}$. The ratio $\sigma$ is
called Poisson’s ratio. One assumes, of course, that the face forces are pure
tensions.
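
The inversion quoted above can be confirmed by a round trip between tensions and strains. The sketch below is illustrative only: the function names are hypothetical, the value of Poisson’s ratio (0.3) and the sample tensions are assumptions, and the modulus is the order of magnitude for steel quoted in the next section.

```python
def strains_from_tensions(tensions, modulus, poisson):
    """Return (delta_mu_1, delta_mu_2, delta_mu_3) from the additive relations."""
    t1, t2, t3 = tensions
    return ((t1 - poisson * (t2 + t3)) / modulus,
            (t2 - poisson * (t1 + t3)) / modulus,
            (t3 - poisson * (t1 + t2)) / modulus)


def tensions_from_strains(strains, modulus, poisson):
    """Inverse relations: (1 - 2s)(1 + s) T_1 = (1 - s) E d1 + s E d2 + s E d3, etc."""
    d1, d2, d3 = strains
    scale = modulus / ((1.0 - 2.0 * poisson) * (1.0 + poisson))
    return (scale * ((1.0 - poisson) * d1 + poisson * (d2 + d3)),
            scale * ((1.0 - poisson) * d2 + poisson * (d1 + d3)),
            scale * ((1.0 - poisson) * d3 + poisson * (d1 + d2)))


E_STEEL = 2.0e12          # dyn/cm^2, order of magnitude for steel
POISSON = 0.3             # assumed illustrative value
applied = (1.0e9, -5.0e8, 2.0e8)   # sample tensions with |T|/E below 1e-3

strains = strains_from_tensions(applied, E_STEEL, POISSON)
recovered = tensions_from_strains(strains, E_STEEL, POISSON)
```

Here `recovered` should agree with `applied` up to rounding. Note that the inverse relations break down as $\sigma$ approaches $1/2$, where the factor $1 - 2\sigma$ vanishes.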
7.4. An Elastic Collision

We have as a matter of policy refrained from discussing the solution of 
partial differential equations, since that is precisely what is readily available 
in mathematics courses and many other courses. However, it does seem 
desirable to carry through one example in detail. The example is simplified 
in the sense that it is based on a one-dimensional version of the three- 
dimensional discussion given above. This will permit us to handle it by 
purely formal methods, while in most practical cases, which involve three 
dimensions, numerical procedures are required. 

We consider two steel cylinders or bars with circular cross sections and
each of length $l$. It is convenient to think of $l$ as large relative to the diameter
of the cross section. These bars can move lengthwise in a horizontal trough 
of semicircular cross section in an essentially frictionless manner. The use of 
an air cushion effect would yield this possibility. 

We suppose that initially one bar is standing still and the other bar is 
approaching it with speed v. The bars collide elastically, that is, without loss 
of energy. After this, the first bar moves with the original speed v and the 
second bar is motionless. This result can be readily predicted by Newtonian 
relativity, but our interest is in what happens in the bars themselves. 
We choose the $x$ axis along the line determined by the axes of the cylinders.
The origin of this axis is chosen so that the first bar initially is in the interval
$0 \le x \le l$. Time is determined so that $t = 0$ corresponds to the instant of first
contact, and hence, at $t = 0$ the second bar is located in the interval $-l \le x \le 0$.
At this instant the first bar is still, the second bar has speed $v$ directed
positively along the $x$ axis.

The collision obviously involves a transformation from kinetic energy
into elastic potential energy and then back again into kinetic. The relation
governing this transformation is determined experimentally, but it is simpler
than that of our previous discussion, since we are interested only in one-dimensional
effects. If we take a bar of length $h$ and compress it with a force
$F$ to a length $h - e$, then for a steel bar it is found that for a considerable range
of $F$, $F = EA(e/h)$, where $E$ is a constant and $A$ is the area of the cross section.
The energy involved in compressing the bar to this extent is $\int_0^e F\,de = EAe^2/2h
= Fe/2$. $E$ is very large, so that $e/h$ is usually small. For steel, $E$ is about
$2 \times 10^{12}$ in the cgs system. Thus, if a kilogram of material is supported by a rod
of 1 cm$^2$ in cross section, the weight of the material corresponds to $F = 0.98
\times 10^6$ dynes, so that $e/h$ is about $0.5 \times 10^{-6}$.
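
The arithmetic of this example can be reproduced in a few lines. This is only a sketch: the function names, the rounded value of the acceleration of gravity (980 cm/s$^2$), and the 100 cm rod length used for the energy are assumptions introduced for illustration.

```python
E_STEEL = 2.0e12   # dyn/cm^2, the value quoted for steel
G_CGS = 980.0      # cm/s^2, rounded acceleration of gravity (assumed)


def compression_ratio(force, area, modulus=E_STEEL):
    """e/h from F = E A (e/h)."""
    return force / (modulus * area)


def stored_energy(force, area, length, modulus=E_STEEL):
    """Energy F e / 2 stored in a bar of the given uncompressed length."""
    e = compression_ratio(force, area, modulus) * length
    return 0.5 * force * e


weight = 1000.0 * G_CGS                     # weight of 1 kg, in dynes
ratio = compression_ratio(weight, area=1.0)  # e/h for a 1 cm^2 rod
energy = stored_energy(weight, area=1.0, length=100.0)  # ergs, for a 100 cm rod
```

With these inputs `ratio` comes out at $4.9 \times 10^{-7}$, the "about $0.5 \times 10^{-6}$" of the text, and the stored energy is a few tens of ergs, which shows how little energy a static load of this size puts into the rod.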

We take as our reference situation the position of the bars at the instant
of contact. For $t > 0$, the cross section of either bar with abscissa $x_0$ in the
reference position has abscissa $x = x_0 + u(x_0, t)$. We take a subdivision of the
initial interval $-l \le x_0 \le l$ with endpoints $x_{j-1}$, $x_j$ such that the compression
in the interval $x_{j-1}, x_j$ is essentially uniform. The uncompressed length of the
bar in this interval is $h = x_j - x_{j-1} = \Delta x_j$. The new length is

$$h - e = x_j + u(x_j, t) - x_{j-1} - u(x_{j-1}, t) = \Delta x_j + \frac{\partial u}{\partial x}(x_j', t)\,\Delta x_j$$

so that

$$e = -\frac{\partial u}{\partial x}\,\Delta x_j \quad \text{and} \quad F = EA(e/h) = -EA\,\frac{\partial u}{\partial x}(x_j', t).$$

Thus, the force across a cross section with initial abscissa $x$ is

$$-EA\,\frac{\partial u}{\partial x}(x, t) = F(x).$$

For a segment $x_1 < x_0 < x_2$ the rate of change of momentum in the
positive $x_0$ direction is

$$\frac{d}{dt} \int_{x_1}^{x_2} A\rho_0\,\frac{\partial u}{\partial t}(y, t)\,dy = -F(x_2) + F(x_1) = EA \left[ \frac{\partial u}{\partial x}(x_2, t) - \frac{\partial u}{\partial x}(x_1, t) \right],$$

i.e., the compression on the $x_2$ cross section will slow the segment down while
that on the $x_1$ cross section will accelerate it. Notice that the right-hand side
is really a surface integral. Numerical procedures usually are based on
equations of this type.
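
As an illustration of a numerical procedure built directly on this momentum balance, the following Python sketch divides the two bars into equal segments and advances the mean velocity and displacement of each segment from the difference of the forces on its two end cross sections. It is a minimal sketch under stated assumptions, not a recommended production scheme: the function name, the number of segments, the unit values of $l$, $v$, and $c = \sqrt{E/\rho_0}$, and the choice of time step $h/c$ are all illustrative, and the forces at the free ends $x_0 = \pm l$ are taken to be zero.

```python
def simulate_collision(n_cells=40, length=1.0, c=1.0, v=1.0):
    """March the segment-wise momentum balance from t = 0 to t = 2 l / c.

    Segments cover -l <= x0 <= l; the left half (second bar) starts with
    speed v, the right half (first bar) starts at rest.
    Returns the lists of segment displacements and segment velocities.
    """
    h = 2.0 * length / n_cells      # uncompressed segment length
    dt = h / c                      # time step (an assumed, convenient choice)
    u = [0.0] * n_cells             # displacement of each segment
    w = [v if j < n_cells // 2 else 0.0 for j in range(n_cells)]  # velocities

    for _ in range(n_cells):        # n_cells * dt equals 2 l / c
        # c^2 * du/dx at each cross section; zero at the two free ends.
        pull = [0.0] * (n_cells + 1)
        for j in range(1, n_cells):
            pull[j] = c * c * (u[j] - u[j - 1]) / h
        # Momentum balance per segment, divided through by A * rho0 * h.
        for j in range(n_cells):
            w[j] += dt * (pull[j + 1] - pull[j]) / h
        for j in range(n_cells):
            u[j] += dt * w[j]
    return u, w


final_u, final_w = simulate_collision()
half = len(final_w) // 2
```

At the end of the run the left-hand segments should be at rest and the right-hand segments should move with speed `v`, which is the exchange of velocities expected for the collision. Because the interior forces cancel in pairs, the total momentum is conserved by construction, whatever the step size.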

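If we let $c^2 = E/\rho_0$, we have

$$\frac{d}{dt} \int_{x_1}^{x_2} \frac{\partial u}{\partial t}(y, t)\,dy = c^2 \left[ \frac{\partial u}{\partial x}(x_2, t) - \frac{\partial u}{\partial x}(x_1, t) \right]. \qquad (1)$$

Now if we can find a time interval and a spatial interval $x_1 \le x \le x_2$
such that $\partial u/\partial t$ is continuously differentiable on this interval, this relation
yields

$$\int_{x_1}^{x_2} \frac{\partial^2 u}{\partial t^2}(y, t)\,dy = c^2 \left[ \frac{\partial u}{\partial x}(x_2, t) - \frac{\partial u}{\partial x}(x_1, t) \right].$$

Thus, for a range of $x_2$, this equation can be differentiated and yields

$$\frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2}. \qquad (2)$$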

The solution of this partial differential equation is in the form

$$u(x, t) = f(x + ct) + g(x - ct),$$

where $f$ and $g$ are arbitrary twice-differentiable functions of a single variable.
Unfortunately this is too restrictive. For example,

$$\frac{\partial u}{\partial t}(x, t) = c\,[f'(x + ct) - g'(x - ct)]$$

and our initial conditions require

$$\frac{\partial u}{\partial t}(x, 0) = v \quad \text{for } -l \le x < 0, \qquad \frac{\partial u}{\partial t}(x, 0) = 0 \quad \text{for } 0 \le x \le l.$$

Hence, $\partial u/\partial t$ is discontinuous, which implies that either $f'(x)$ or $g'(x)$ is
discontinuous or both and $\partial^2 u/\partial t^2$ is not available. The solution is to consider
Equation (1). We have

$$\frac{\partial u}{\partial x}(x, t) = f'(x + ct) + g'(x - ct),$$

and if $f$ and $g$ are integrals of their derivatives, then the indicated operations
on the left-hand side of Equation (1) can be carried out and thus $u$ can be
taken in the indicated form with only the restriction that $f$ and $g$ be differentiable
and equal to an integral of their derivatives.

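Now $f$ and $g$ must be determined by the boundary conditions. These are

(a) $u(x, 0) = 0$ for $-l < x < l$,

(b$_1$) $\dfrac{\partial u}{\partial t}(x, 0) = v$ for $-l \le x < 0$,

(b$_2$) $\dfrac{\partial u}{\partial t}(x, 0) = 0$ for $0 \le x \le l$,

(c$_1$) $\dfrac{\partial u}{\partial x}(-l, t) = 0$ for $t \ge 0$,

(c$_2$) $\dfrac{\partial u}{\partial x}(l, t) = 0$ for $t \ge 0$.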

These conditions can be used to determine $f$ and $g$ over the range of
interest. Thus, $f$ is defined

$$f(y) = \begin{cases} vy/2c, & -l \le y \le 0 \\ 0, & 0 \le y \le 2l \\ (v/2c)(y - 2l), & 2l < y < 3l, \end{cases}$$

and $g$ is defined

$$g(y) = \begin{cases} 0, & 0 \le y \le l \\ -vy/2c, & -2l \le y \le 0 \\ vl/c, & -3l < y < -2l. \end{cases}$$

Both $f$ and $g$ are equal to integrals of their derivatives, but the derivatives are
discontinuous.

The functions $f$ and $g$ describe the situation throughout the collision.
The development is most easily visualized by first graphing $f$, $f'$, $g$, and $g'$. The
situation at time $t$ is obtained by shifting the graph of $f$ and $f'$ by $ct$ to the left,
and that of $g$ and $g'$ by $ct$ to the right.

For the time $0 \le t \le l/c$ there are three zones. For $-l \le x \le -ct$ the $x$
cross section is moving with velocity $v$ and has zero stress, and this segment
has been displaced to the right an amount $vt$. For $-ct \le x \le ct$ the $x$ cross
section is moving with speed $v/2$, the stress of compression is $Ev/2c$, and the
displacement varies uniformly from $vt$ to zero. For $ct \le x \le l$ the displacement,
speed, and stress are zero. During this time the second zone expands evenly
until at $t = l/c$ the other two zones have length zero.

During the time $l/c \le t \le 2l/c$ there are also three zones. For $-l \le x \le -l
+ (ct - l) = ct - 2l$ the displacement is $vl/c$, but the speed and stress are zero.
For $ct - 2l \le x \le 2l - ct$ the speed is $v/2$, the compression is $Ev/2c$, and the
displacement varies from $vl/c$ to $vt - vl/c$. For $2l - ct \le x \le l$ the speed is $v$, the
compression is zero, and the displacement is $vt - vl/c$. During this time, the
middle zone contracts until at time $t = 2l/c$ each bar has had a displacement of
$vl/c$, but the left-hand bar has speed zero, the right-hand bar has speed $v$.
There is no compression, and the bars begin to separate.
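
The zone description can be checked against the functions $f$ and $g$ directly. The sketch below evaluates $u(x, t) = f(x + ct) + g(x - ct)$ and estimates the speed by a centered difference in time. The numerical values of `V`, `C`, and `L`, the function names, and the difference step are assumptions chosen for illustration; the formulas are meant only for $-l \le x \le l$ and $0 \le t \le 2l/c$.

```python
V, C, L = 2.0, 4.0, 1.0   # illustrative speed, wave speed, and bar length


def f(y):
    """f on -l <= y <= 3l."""
    if y <= 0.0:
        return V * y / (2.0 * C)
    if y <= 2.0 * L:
        return 0.0
    return (V / (2.0 * C)) * (y - 2.0 * L)


def g(y):
    """g on -3l <= y <= l."""
    if y >= 0.0:
        return 0.0
    if y >= -2.0 * L:
        return -V * y / (2.0 * C)
    return V * L / C


def displacement(x, t):
    """u(x, t) = f(x + ct) + g(x - ct)."""
    return f(x + C * t) + g(x - C * t)


def speed(x, t, dt=1.0e-6):
    """Centered-difference estimate of du/dt (not valid exactly on a zone edge)."""
    return (displacement(x, t + dt) - displacement(x, t - dt)) / (2.0 * dt)


t_early = 0.5 * L / C    # inside the first time interval
t_late = 1.5 * L / C     # inside the second time interval
t_end = 2.0 * L / C      # instant at which the bars separate
```

For example, `displacement(-0.9, t_early)` should equal `V * t_early`, `speed(0.0, t_early)` should equal `V / 2`, and at `t_end` every cross section should have displacement `V * L / C`.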


7.5. Thermodynamic States and Reversibility 

As discussed in Section 7.2, in the Eulerian description of the behavior
of substance, the equation of continuity and Newton’s third law yield four
equations to determine the quantities $\rho, v^1, v^2, v^3$. The situation is similar in
the alternate approach, i.e., we have three dynamic equations for the $y^i(x_0, t)$,
and the density is given in terms of the reference situation density and the
Jacobian. But in both cases we must know the $\sigma_{ij}$, which have to be determined
on some empirical basis that must be added to the geometric and Newtonian
principles.

The simple experiment we described of compressing the cube of substance
and measuring the stresses relative to the given deformation has
difficulties that are rather obvious based on general experience. We know
that if we compress something, we probably won’t receive back the work we
did in compressing it. We ignored temperature and the possibility that heat
will flow and that this will affect the measured quantities. Actually the situation
is quite complex. We must obtain a reliable pattern of experience that
can be formulated in mathematical terms. This formulation will involve 
additional intensive and extensive functions but will be in the same analytic 
framework of functions, partial differentiation, and geometric integration. 
This formulation is part of the more general theory of thermodynamics. 

We now specialize the $\sigma_{ij}$. Simplification is desirable and we also need to
make contact with the readily accessible literature. We assume that the $\sigma_{ij}$ are
in the form $\sigma_{ij} = -p(y)\delta_{ij}$, so that the matrix $\sigma = -p(y)I$. For each $n$, the stress $F(n)$ is
along $n$ and at a given point, $F(n)$ has the same magnitude for every direction.
This situation is certainly valid in a stationary fluid.

Let $\mathcal{R}$ be a region in the substance with boundary $\mathcal{S}$. The force on the
substance in $\mathcal{R}$ due to pressure on the boundary is

$$\iint_{\mathcal{S}} F(n)\,dA = -\iint_{\mathcal{S}} p\,n\,dA = -\iint_{\mathcal{S}} p\,(i_1\,dx^2\,dx^3 + i_2\,dx^3\,dx^1 + i_3\,dx^1\,dx^2).$$

If we assume that the body force density is expressible as $\rho(y)\Phi(y)$, then in the
Euler description the integral form for Newton’s third law is

a a 

If $p$ has continuous derivatives, this yields

$$-\frac{1}{\rho}\,\mathrm{grad}\,p + \Phi = \frac{Dv}{Dt}.$$

The component relations of this vector equation are Euler’s equations of
motion, that is, the Navier-Stokes equations with the viscous terms absent.
When there are surfaces of discontinuity for the pressure,
the integral form is required. Such surfaces of discontinuity are called shock
waves. The situation for the alternate description is analogous.

We now consider the compression experiment with the stress appearing
in the form of pressure. We have the associated ideas of heat and temperature.
Temperature is an intensive function. Heat is a flow of energy across a
surface and occurs when there is a temperature gradient from a hotter to a
cooler region. The rate of heat flow may be large or very small in response to
a given temperature gradient, depending on the surface. We will suppose that
the only actions that occur are compression or expansion and heat exchanges.
This would be the case for an ideal gas and represents the simplest situation in
which thermodynamic principles can be formulated. Usually chemical and
electrical aspects of energy also must be considered.

Thermodynamics describes the way in which a substance is affected by
compression, expansion, and heat exchanges and how it develops spontaneously
in various situations. But the basis of this discussion is certain
idealized experiments for which a number of thermodynamic functions can
be defined and in which these functions have certain mathematical relations.
One hopes these relations hold for bodies of very small spatial extent so that
they can be used in infinitesimal analysis, and this is the real significance of
the “idealization.” For experiments involving a finite amount of substance
the idealized procedures have two characteristics. One of these is that a body
of substance in the experiment must be spatially uniform. Thus, in such a
body the density, pressure, and any other intensive function has to be uniform,
and any extensive function must have a uniform density. Hence, the development
of the experiment can be described by giving the functions as functions
of the time, i.e., a curve or path in a space whose coordinates correspond to
these functions. A situation in which the intensive functions are uniform in a
body is called a thermodynamic state. Thus, each point on the path associated
with such an experiment is a state. The second characteristic of the idealized
experiments is that if $\mathcal{C}$ is the path of the experiment in the function coordinate
space going from $P_1$ to $P_2$, there is also an experiment reversing the path
$\mathcal{C}$ and going from $P_2$ to $P_1$. The procedure is then said to be reversible.
Irreversible processes include breakage, cold working, or plastic set. One
may be able to return to $P_1$, but not by reversing the path.

For a substance of finite extent the type of change postulated in which 
the intensive functions vary uniformly in regard to space can be at best only 
approximated. It is usually assumed that the substance is confined in a 
cylinder by a piston and that the walls can be modified to permit heat flows. 
But any motion by the piston or any heat exchange will certainly introduce 
spatial variations in pressure and temperature. It is assumed that uniformity 
can be approximated by proceeding very slowly. But this is not the way the 
relations described in these experiments are verified. Instead the consequences
of these relations, especially in infinitesimal analysis, are compared
with experience. Clearly these experiments are theoretical concepts utilized 
in the mathematical formulation of experience. 


7.6. Thermodynamic Functions 

Consider, then, such a piston-cylinder-substance apparatus that varies
only in such a way that the substance is always in a thermodynamic state.
For the substance there must be one extensive quantity, and there is an
advantage in taking it to be the mass, which is a constant. The intensive
quantities clearly include pressure, density, and temperature. For reasons
that will be apparent it is customary to replace the density by the volume of
substance, i.e., the mass divided by density. There are circumstances in which
this replacement is awkward. However, two other intensive functions appear:
the internal energy per unit mass and the entropy per unit mass. To agree
with the usual notation, which implies that one is dealing with the corresponding
extensive functions, we will take the mass as a unit and let $U$ stand
for the internal energy and $S$ for the entropy.

There are two ways in which the substance participates in energy
exchanges. One of these is the work done by the substance by moving the
piston, say, an amount $ds$. Then $dW = F\,ds = pA\,ds = p\,dV$. We suppose that
we can control the nature of the walls of the cylinder relative to the transmission
of heat. The amount of heat entering the substance is denoted by
$dQ$. The principle of conservation of energy states that $dU = dQ - p\,dV$. Since
there are two independent ways in which $U$ can be varied, the possible
paths for a thermodynamic process lie in a two-dimensional surface. In many
cases this surface can be parametrized by choosing two of the intensive
functions as parameters and expressing the others in terms of them, and this is
usually done. One example of a pair of functions is the pressure
and volume.

Consequently the differentials we deal with are differential forms on two
independent variables. The differentials $dW$ and $dQ$ are such forms, but
$dU$ is an exact differential. We also have in the coordinate function space for
the substance an exact differential $dS$ such that $dQ = T\,dS$, where $T$ is the
temperature and $S$ is the entropy.

An interesting example is represented by an “ideal gas.” There are
certain relations between the functions for this theoretical substance and
they are derived from a kinetic model of a gas. For an actual gas these relations
are verified experimentally, and they usually apply if the gas is not near
condensation. These relations are $pV = MRT$ and $U = \tfrac32 MRT$, where $T$ is the
absolute temperature and $R$ is a constant chosen so $RT$ has a value in the
appropriate energy units. Notice that these relations can be written $p = \rho RT$
and $u = \tfrac32 RT$, where $\rho$ is the density and $u$ is the energy per unit mass. We will
follow the more usual choice of function, that is, $V$, and also will take $M = 1$.
We have then

$$T\,dS = dQ = dU + p\,dV = \tfrac32 R\,dT + p\,dV$$

or

$$dS = \tfrac32 R \left( \frac{dT}{T} + \frac23\,\frac{p\,dV}{RT} \right) = \tfrac32 R \left( \frac{dT}{T} + \frac23\,\frac{dV}{V} \right)$$

or

$$S = \tfrac32 R \ln(TV^{2/3}) + C = \tfrac32 R \ln(pV^{5/3}) + C',$$

where $C$ and $C'$ are constants.

Of course, the assumption that we have used earlier that there is an
elastic energy function means that the coordinate function space is one-dimensional.
For this type of situation we must have $dQ = 0$, that is, $dS = 0$ or
$S$ is a constant. The system then must vary in the $p$, $V$ plane along a curve
$pV^{5/3} = A$. If the substance varies from a point $P_1$ with coordinates $p_1$, $V_1$
and temperature $T_1$ to a point $P_2$ with $p_2$, $V_2$, and $T_2$, then the work done by the
substance is

$$W = \int p\,dV = -\int dU = -U\Big|_1^2 = \tfrac32 R(T_1 - T_2).$$

The alternate form of the adiabatic condition, $TV^{2/3} = B$, yields

$$\frac{T_2}{T_1} = \left( \frac{V_1}{V_2} \right)^{2/3}.$$

The adiabatic assumption is used in treating the propagation of sound, since
the oscillations are fast enough so that the immediate heat flow can be
neglected.
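
The adiabatic relations can be verified numerically for the ideal gas with $M = 1$. The following sketch is illustrative: the value of `R`, the initial state, the final volume, and the function names are assumptions. It integrates $p\,dV$ along the curve $TV^{2/3} = B$ by the midpoint rule and compares the result with $\tfrac32 R(T_1 - T_2)$; it also evaluates the two expressions for the entropy, which should differ only by a constant.

```python
import math

R = 2.0   # gas constant per unit mass in arbitrary units (assumed value)


def entropy_from_TV(T, V):
    """S up to an additive constant, in the variables T and V."""
    return 1.5 * R * math.log(T * V ** (2.0 / 3.0))


def entropy_from_pV(p, V):
    """S up to a (different) additive constant, in the variables p and V."""
    return 1.5 * R * math.log(p * V ** (5.0 / 3.0))


def adiabatic_work(T1, V1, V2, steps=20000):
    """Midpoint-rule integral of p dV along T V^(2/3) = const, with p = R T / V."""
    B = T1 * V1 ** (2.0 / 3.0)
    dV = (V2 - V1) / steps
    total = 0.0
    for n in range(steps):
        V = V1 + (n + 0.5) * dV
        T = B / V ** (2.0 / 3.0)
        total += (R * T / V) * dV
    return total


T1, V1, V2 = 300.0, 1.0, 0.5                 # illustrative compression
T2 = T1 * (V1 / V2) ** (2.0 / 3.0)           # temperature after compression
work_numeric = adiabatic_work(T1, V1, V2)
work_formula = 1.5 * R * (T1 - T2)           # negative: work is done on the gas
```

The entropy evaluated at $(T_1, V_1)$ and at $(T_2, V_2)$ should be the same, and `work_numeric` should agree with `work_formula` to within the quadrature error.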

In general the “substance” may have a more complex structure, and $dU$
may have additional terms. For example, a voltage cell may produce a
transfer of electric charge and one adds a term $v(-de)$, where $v$ is the voltage.
A major practical aspect of energy transformation is chemical, and one has
composition ratios and “chemical potentials” $\mu_i$ that constitute terms
which are added to $dU$. Electric and magnetic fields may induce
polarization, magnetization, and electric currents. Thus, the dimensionality
of the intensive function coordinate space may be much higher. All these
energy forms act in conjunction with heat flows. Heat flows can be between
bodies in contact or by radiation. In general changes in $U$ will correspond to
gradients in the coefficients $p$, $T$, $v$, $\mu_i$, etc. or differences in their values for
different bodies. Rates of change usually require additional empirical
information.


7.7. The Carnot Cycle and Entropy 

We consider again the case of a single substance whose two-dimensional
function coordinate space can be parametrized in terms of pressure and
volume, $p$ and $V$. Suppose we have such a substance in a cylinder with a
piston and two reservoirs for heat, one at a higher temperature than the other.
We suppose that we have controlled heat-conducting connections between
the reservoirs and the substance in the cylinder. There is a procedure, A, by
which heat can be transferred through the cylinder-piston-substance apparatus
from the hot reservoir to the cool one and partly converted into work
$W_A$. There is also a procedure B by which an amount of work $W_B$ can be used
to produce a heat interchange in the opposite direction. In practice, however,
if we apply A and B or B and A successively so as to restore the original
situation in the reservoirs and apparatus, then $W_B > W_A$. There are always
uncontrolled heat flows, which account for this difference as an increase in
entropy.

There is, however, a thermodynamically idealized version of procedure
A that is “reversible.” This is termed the Carnot cycle. We will suppose that
the apparatus substance is a perfect gas, and the time development of this gas
will be described by curves in the plane with Cartesian coordinates $p$ and $V$.
This cycle has four phases, each of which corresponds to a segment of a
curve. We begin the first phase at a point $P_1$, $(p_1, v_1)$, which is at temperature
$t_1$, the same as the cooler reservoir. In the first phase the substance is compressed
adiabatically from a volume $v_1$ to a volume $v_2$, the abscissa of the
point $P_2$, $(p_2, v_2)$, so that the temperature increases from $t_1$ to $t_2$, the temperature
of the hot reservoir, in accordance with formula

$$\frac{t_2}{t_1} = \left( \frac{v_1}{v_2} \right)^{2/3}$$

from the previous section. The work done by the substance in this phase is
negative, $W_1 = \tfrac32 R(t_1 - t_2)$. By means of an infinitesimal gradient heat is now
permitted to flow from the hot reservoir into the substance that remains at the
same temperature $t_2$. In this second phase, $U$ for the gas is a constant, and
thus the work done, $W_2$, by the substance equals the heat, $Q_2$, received, and
furthermore, for $M = 1$ we have

$$W_2 = \int_{v_2}^{v_3} p\,dV = Rt_2 \int_{v_2}^{v_3} \frac{dV}{V} = Rt_2 \ln(v_3/v_2).$$

The third phase is an adiabatic expansion from $v_3$ to $v_4$ corresponding to a
return of the gas to temperature $t_1$; i.e., we have $t_2/t_1 = (v_4/v_3)^{2/3}$ and
$W_3 = \tfrac32 R(t_2 - t_1)$. We again have an isothermal contraction in the fourth
phase with heat $Q_4$ flowing from the substance into the cool reservoir,
$Q_4 = -W_4$, and

$$W_4 = \int_{v_4}^{v_1} p\,dV = -Rt_1 \ln(v_4/v_1).$$

This isothermal contraction restores the original volume $v_1$ at temperature
$t_1$, and hence, the original pressure $p_1$ is also restored. This cycle is illustrated
in Figure 7.1.
Thus, in this process the apparatus is restored to its original condition,
an amount $Q_2 = Rt_2 \ln(v_3/v_2)$ of heat has come from the hot reservoir, an
amount $Q_4 = Rt_1 \ln(v_4/v_1)$ has entered the cool reservoir, and the difference
$Q_2 - Q_4$ corresponds to the total work done by the substance. Since $v_2/v_1 =
(t_1/t_2)^{3/2} = v_3/v_4$, we have $v_3/v_2 = v_4/v_1$; the output work is

$$W_2 + W_4 = Q_2 - Q_4 = R(t_2 - t_1) \ln(v_3/v_2).$$

The hot reservoir has had a decrease in entropy $Q_2/t_2 = R \ln(v_3/v_2)$ and the
cool reservoir an increase in entropy $Q_4/t_1$, which has this same value. This
whole procedure can be reversed by cycling in the inverse order. Thus, one
could continue to convert heat into work until the temperatures are equal or,
conversely, produce a heat difference by work.
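
The bookkeeping of the four phases can be laid out in a short computation. This is a sketch under stated assumptions: the function name, the value of `R`, the two temperatures, the starting volume, and the isothermal expansion ratio $v_3/v_2$ are illustrative choices. The last entry, the ratio of output work to $Q_2$, reduces for the ideal cycle to $(t_2 - t_1)/t_2$, a standard consequence of the formulas above.

```python
import math


def carnot_cycle(t1, t2, v1, expansion, R=1.0):
    """Works, heats, and entropy changes for the ideal-gas Carnot cycle (M = 1).

    t1, t2    : temperatures of the cool and hot reservoirs (t2 > t1)
    v1        : volume at the start of the adiabatic compression
    expansion : the ratio v3 / v2 chosen for the isothermal expansion
    """
    v2 = v1 * (t1 / t2) ** 1.5          # from t2/t1 = (v1/v2)^(2/3)
    v3 = expansion * v2
    v4 = v3 * (t2 / t1) ** 1.5          # from t2/t1 = (v4/v3)^(2/3)
    w1 = 1.5 * R * (t1 - t2)            # adiabatic compression
    w2 = R * t2 * math.log(v3 / v2)     # isothermal expansion at t2
    w3 = 1.5 * R * (t2 - t1)            # adiabatic expansion
    w4 = -R * t1 * math.log(v4 / v1)    # isothermal contraction at t1
    q2, q4 = w2, -w4                    # heat taken in and heat given up
    total_work = w1 + w2 + w3 + w4
    return {
        "volumes": (v1, v2, v3, v4),
        "total_work": total_work,
        "q2": q2,
        "q4": q4,
        "entropy_out_of_hot": q2 / t2,
        "entropy_into_cool": q4 / t1,
        "work_over_q2": total_work / q2,
    }


cycle = carnot_cycle(t1=300.0, t2=400.0, v1=1.0, expansion=2.0)
```

For the sample values the two entropy entries coincide, `total_work` equals `q2 - q4`, and `work_over_q2` is 0.25, that is, $(400 - 300)/400$.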

It is interesting to see what one should expect in practice if one tried to
realize this setup. Suppose that in regard to heat the combination of apparatus
and two reservoirs is completely isolated. There would be, of course,
heat leaks from the hot reservoir to the cool one, and if an amount of heat
$Q$ leaked in this way, the two reservoirs would have an increase in entropy
$Q(1/t_1 - 1/t_2) = Q(t_2 - t_1)/t_1 t_2$. In the adiabatic compression stage the compressive
work would exceed the actual increase in energy $U$, and presumably
the extra work would be converted to heat leaking into the cooler reservoir.
In the heat absorption stage the actual temperature of the substance is less
than that of the reservoir to induce a flow of heat, and hence, the entropy gain
of the substance is greater than the entropy loss of the upper reservoir. There
is a converse result in the fourth stage so that the increase of entropy in the
cool reservoir exceeds the loss of entropy of the substance. In the adiabatic
expansion the work obtained will be less than the change in $U$. Thus, in
general one can consider the actual operation as that of the theoretical
Carnot cycle with a leak, and this permits a definition of efficiency as the ratio
of output work to the difference $Q_2 - Q_4$. When $Q_2$ has been obtained by the
consumption of fuel and the lower reservoir is considered to be essentially
infinite, it is more customary to define efficiency as the ratio of output
work to $Q_2$.

7.8. The Relation with Applied Mathematics 

The procedures of the calculus based on such notions as functions,
partial differentiation, and geometric integration constitute a mathematical
format that is applicable to three major aspects of the exact sciences. These
are Newtonian dynamics (including the variational developments), electromagnetism,
and thermodynamics.

A culture has many aspects, but there must be a mainstream of
activity that provides the necessities and amenities of life, and there must also
be communication. In our culture these three science areas structure
technology for this mainstream and communication. This is evident in the
role of machinery for production and transportation and in our methods for
communication and handling power.

Scientific understanding has a unity that does not permit facile dissection,
but it is most appropriate to indicate the
role of thermodynamics in applications. Except for water power, our major
methods of producing power depend on first converting a latent source of
energy into heat, second, converting this heat into mechanical energy, and
third, converting the mechanical energy into electrical. The understanding of
the second step in this process was both of great practical value and philosophical
significance. This understanding was required both for the efficient
production of power and for the use of power in transportation.

The universality of the first and second laws of thermodynamics colors
the thought of all persons who have grasped its meaning. If like the ancient
philosophers one seeks for a single element that is the fundamental constituent
of everything, then there is only one candidate available in light of our
present knowledge, that is, energy. The laws of thermodynamics are thought
to describe the course of energy even to the most esoteric extremes, far
beyond the usual experience that we have been discussing.

A very important application of thermodynamics is to structure the
science of chemistry. Thermodynamic considerations limit sharply
the amount of empirical information needed to establish the energy of
composition of chemical compounds. These considerations also yield equilibrium
relations and reaction rates. Physical chemistry and thermodynamics
are essentially inseparable.

We see, then, a continued complementary development of cultural 
capability and the mathematical formulation of experience. The elementary
combination of finite set logic and procedures with natural numbers forms
the basis for mercantile transactions and the economic aspects of social
relations. Geometry represented a highly significant expansion of this 
capability to include magnitudes such as lengths, areas, and volumes. The 
problem of incommensurability appears at first to be simply a philosophical 
question and not one of practical significance. But a satisfactory logical 
structure for the expanded mathematical development proved to be of utmost 
importance. We have seen how calculus (that is, infinitesimal analysis) 
greatly expanded the area of experience subject to mathematical formulation. 

We have emphasized comprehension in mathematical terms rather 
than the subsequent procedures for solving the differential equations that 
result from infinitesimal analysis. But it was the availability of such methods 
that focussed the approach we have discussed. In recent years the limitations 
of classical analysis were considerably relaxed by the development first of
analog computing and then of digital numerical methods.

The actual temporal development of these areas in the exact sciences 
was quite complex and subject to many cross currents. The notion of what 
constituted a satisfactory logical formulation of mathematics gradually 
emerged. But this was also associated with a very flexible system of new 
mathematical concepts that were suitable for handling an expanding range 
of experience. 

The classical “method of fluxions” is confined to gross phenomena in 
which the fine structure of matter has only an averaged effect. The effort to 
understand this fine structure in classical terms ran into very serious dif¬ 
ficulties, and new mathematical concepts were required. There were a 
number of motivations for trying to understand the fine structure. One of 
these was precisely the inconsistencies resulting from the classical approach. 
Another was the desirability of replacing the empirical relations needed to 
supplement classical methods by a theoretically consistent development. 
There were also technological rewards, such as the possibility for new and 
extraordinary weapons and communication devices. 

For further discussion of the material in this chapter, see Bergmann, (1)
Denbigh, (2) Fermi, (3) Lamb, (4) Love, (5) Mahan, (6) and Murnaghan. (7)


Exercises 


7.1. An airplane makes a tight turn so as to “pull 8 g’s.” What was the minimum
radius of curvature?











7.2. What does the expression “sidereal month” mean? Based on this information,
how far is the moon?

7.3. An automobile traveling on a level road is brought to a quick stop by applying
the brakes. Discuss quantitatively the forces, torques, and angular motions involved.
7.4. For $J = (\delta^i_j + \partial u^i/\partial x^j)$, $J' = (\delta^i_j + \partial u^j/\partial x^i)$, one has $B = (JJ')^{1/2}$. Let r, s, t be the
elementary symmetric functions of $JJ'$, i.e., if $\alpha_1, \alpha_2, \alpha_3$ are the characteristic roots of
$JJ'$ one has $r = \alpha_1 + \alpha_2 + \alpha_3$, $s = \alpha_1\alpha_2 + \alpha_2\alpha_3 + \alpha_3\alpha_1$, and $t = \alpha_1\alpha_2\alpha_3$. Let $\rho_1, \rho_2, \rho_3$ be the
corresponding quantities for B. One can show that $\rho_1^2 = r + 2\rho_2$, $\rho_2^2 = s + 2\rho_1\rho_3$, and
$\rho_3^2 = t$ and that there is a quartic equation for $\rho_1$ with coefficients expressed in terms of
r, s, and t.

7.5. Suppose one has an elastic energy function $W(\rho_1, \rho_2, \rho_3)$ expressed in terms of
the elementary symmetric functions of the characteristic roots $\mu_1, \mu_2, \mu_3$ of B. Then $\sigma$ has
the same characteristic vectors and characteristic roots $\lambda_i = (\partial W/\partial\mu_i)\mu_i/\rho_3$. Then

1 [(dW dW I dW 

7.6. Let T be an n × n matrix of real numbers. Show that there is an orthogonal
matrix O and a positive definite symmetric matrix A such that T = OA. Are O and A
uniquely determined by T?

7.7. If T is a symmetric matrix of real numbers, there is an orthogonal matrix Q
such that $Q^{-1}TQ$ is diagonal. Prove this and discuss the extent to which Q is determined.


7.8. For

$$T = \begin{pmatrix} 1.07 & 0.1 & 0.05 \\ -0.05 & 0.8 & -0.03 \\ 0.07 & -0.15 & 0.75 \end{pmatrix},$$

find A, B, and O for T (A and B are positive definite, O is orthogonal, and $T = OA = BO$).
7.9. For J in Exercise 4, write $J = I + j$. Then $B^2 = I + j + j' + jj'$. It would
be desirable to express an elastic energy function in terms of the invariants of $j + j'$,
that is, the coefficients of its characteristic equation. But $B^2$ is not determined by $j + j'$,
so this is not justified. [Hint: Consider the case in which $j' = -j$.]

7.10. A metal ring whose cross section has a diameter small relative to the radius
of the ring is rotating around its center. Show that the stress on a cross section is $v^2\rho$,
where $\rho$ is the density and v the linear velocity of a point on the ring. How much will it
expand if it is steel? aluminum? What are the highest speeds that can be obtained?

7.11. A disk of uniform thickness rotates around its center with angular
velocity $\omega$. Let the reference position be that in which $\omega = 0$. Then the point with
cylindrical coordinates $r, \theta, z$ in the reference position will move to the point $r + u(r), \theta, z$ when
the disk is rotating. A small piece of the disk will be extended in both the radial direction
and in the local direction of motion. Suppose that this is due to two simple tensions in
these directions. Then u satisfies a differential equation




with C p 0 £w 2 /(I -a 2 ). This can in general be linearized by neglecting second-degree 
terms in u and its derivatives.

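7.12. If $J = (\partial y^i/\partial x^j)$ is the matrix for a change of variables, $y^i = y^i(x^1, x^2, x^3)$, then
$J = jA$, where $j = (j^i_\alpha)$ is orthogonal and A is positive definite. The orthogonal matrix j
is a function of the x position and constitutes a field of orthogonal matrices. Let

$$\omega_{ijk} = j^\alpha_i \frac{\partial j^\alpha_j}{\partial x^k}, \qquad \omega_k = (\omega_{ijk}).$$

Then $dj = j\,\omega_k\,dx^k$, $\omega_k$ is antisymmetric, and

$$\frac{\partial\omega_{ijk}}{\partial x^l} - \frac{\partial\omega_{ijl}}{\partial x^k} + \omega_{i\alpha l}\,\omega_{\alpha jk} - \omega_{i\alpha k}\,\omega_{\alpha jl} = 0$$

or, in terms of matrices,

$$\frac{\partial\omega_k}{\partial x^l} - \frac{\partial\omega_l}{\partial x^k} + \omega_l\omega_k - \omega_k\omega_l = 0.$$

If a field of $\omega_{ijk}$ satisfies this system of partial differential equations, it determines a field
of orthogonal matrices up to a multiplicative constant matrix. There are similar results
for the $\tilde\omega_{ijk} = j^i_\alpha(\partial j^j_\alpha/\partial x^k)$. [Hint: One has the system of partial differential equations

$$\frac{\partial j^i_j}{\partial x^k} = j^i_\alpha\,\omega_{\alpha jk}$$

for the $j^i_j$ values in terms of the $\omega_{ijk}$.]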

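7.13. In Exercise 12 one has the relations $\partial y^i/\partial x^j = j^i_\alpha a_{\alpha j}$ for $A = (a_{ij})$. These imply

$$\omega_{i\alpha k}\,a_{\alpha j} - \omega_{i\alpha j}\,a_{\alpha k} + \frac{\partial a_{ij}}{\partial x^k} - \frac{\partial a_{ik}}{\partial x^j} = 0.$$

Let a symbol [ij] be defined by [1, 2] = 3, [2, 3] = 1, [3, 1] = 2 and let curl a denote the
matrix with components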


(curl 0),^*,= 


da*j 

dx k 


dx j * 


Let aj denote the vector corresponding to the ith row of the matrix A and (curl«), be 
similar. Let a [i,i] denote the vector (o) ljA , cu IJf3 ) and a {in =a i xa j . Then the above 
cu, a relation is equivalent to 

af * a J - b j(<r a • u a ) + (curl a)r <*j = 0 


or 


af • <r j -^[^(curl a) a - u a ] + (curl u), • uj=0. 

Since a 1 • ^=<5* det(4), this yields 

det {A)<j j = Q(curl a) p • curl a) a • dj]a a . 

Thus, the $\omega_{ijk}$ are determined by the matrix A. The condition of the previous exercise
now becomes a condition on the matrix A, and the matrix A determines the j field up to a
multiplicative constant.

7.14. By considering the inverse change of variables one can show that the B
matrix for which J = Bj also determines the field of orthogonal matrices $j^{-1}$ up to a
multiplicative constant matrix and hence j. Here j is a field on the y space.

7.15. A cylinder is said to be uniformly twisted by the transformation $z' = \alpha z$,
$r' = \beta r$, $\theta' = \theta + \gamma z$. Find the matrix B, its characteristic vectors and values, and the
corresponding stresses using the E, $\sigma$ formulas.

7.16. Consider an acoustic plane wave in a perfect gas. We suppose that the gas 
is in a cylindrical container with elements parallel to the x axis. The acoustic wave 
density relat.onship is adiabatic, i ,,p^ /p ^. One Ss PreSSUre 


d 2 u 


Sp 

dx 




H) 


dx) 
d 2 u 

p oj^= 


P 0 

5 d 2 u 


W~3 p dP- 


^^jmSSSSSS» 

e, 10 0 2 and emits heaUn goinTfrom o Vo S StiT ab ?° rbS m « oin « from 
going from *, to * 2 and decrease in going from R^o R, W ' n ' nCreaSe in 



Discuss the energy exchanges associated with our various sources of power.
How does entropy increase and what happens to this increase?
Probability 


8.1. The Development of Probability 

The theory of probability appears to have arisen from two sources, both 
of great antiquity—marine insurance and games of chance. Marine insurance 
was practiced by the Babylonians, Phoenicians, Rhodians, Greeks, and 
Romans. It persisted through the Dark Ages and medieval times. The English 
and other northern Europeans followed Italian models in the sixteenth 
century, and this business ultimately expanded worldwide. Rates were set 
initially by individual insurers, but later much more sophisticated pro¬ 
cedures were required (see Flower and Jones (10) ). 

The formal development of the theory of probability began, according
to Todhunter, (17) with a certain correspondence, about 1654, between
Pascal, Fermat, and the Chevalier de Méré concerning games of chance.
Treatises were written by Huygens (1657), James Bernoulli (1705), Montmort
(1708), and De Moivre (1711). The increasing use of analytic methods during
the eighteenth century is represented by works of Daniel Bernoulli (1738),
Euler (1764-1766), Bayes (1763-1765), Lagrange (1770-1773), and Laplace
(1774). Modern references are Feller (9) and Cramer. (6)

In the probability theory for dice or card games, there is supposed to be 
for each play, i.e., die throw or hand of cards, a totality of equally likely events 
and a subset of favorable events, and the ratio of the number of the latter to 
that of the totality is the probability of a favorable outcome. In playing the 
game one has a set of compound events with varying possible outcomes, and 
the mathematical theory is analogous to a measure and integration theory 
on finite discrete sets with a multiple dimension structure. The technical 
mathematical problem is to obtain closed formulas for the count of various 
possibilities. The binomial coefficients, $\binom{n}{k}$, specify the ways in which one can
choose k elements from a set of n, and Pascal invented the well-known triangle
for computing these inductively. In general, quite ingenious analogies,
especially with algebraic processes, are possible. Thus, if a die is thrown n
times, the number of sequences of throws that will have p as the sum is the
coefficient of $x^p$ in the expansion of $(x + x^2 + x^3 + x^4 + x^5 + x^6)^n$, for a choice
of a term from each of the n factors of the last expansion is analogous to a
sequence of n throws of the die.
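
The counting argument above can be checked mechanically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and the brute-force enumeration is included only to confirm that the polynomial coefficients count the same sequences of throws.

```python
from itertools import product
from math import comb


def dice_sum_counts(n, faces=6):
    """Coefficients of (x + x^2 + ... + x^faces)^n, indexed by the exponent."""
    coeffs = [1]  # start from the constant polynomial 1
    for _ in range(n):
        new = [0] * (len(coeffs) + faces)
        for power, count in enumerate(coeffs):
            for face in range(1, faces + 1):
                # choosing the term x^face from one more factor
                new[power + face] += count
        coeffs = new
    return coeffs


def brute_force_count(n, p, faces=6):
    """Directly count the sequences of n throws whose sum is p."""
    return sum(1 for throws in product(range(1, faces + 1), repeat=n)
               if sum(throws) == p)


def pascal_row(n):
    """Row n of Pascal's triangle, built inductively from the row above."""
    row = [1]
    for _ in range(n):
        row = [a + b for a, b in zip([0] + row, row + [0])]
    return row


counts = dice_sum_counts(3)
print(counts[10], brute_force_count(3, 10))
print(pascal_row(5) == [comb(5, k) for k in range(6)])
```

Dividing a coefficient by $6^n$, the total number of equally likely sequences, gives the probability of the corresponding sum.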

Elementary probability theory is a form of measure theory on a struc¬ 
tured set of events. The rules for specifying the measure and integrals in 
accordance with the structure constitute a mathematical theory that appears 
to be applicable to various areas of experience. The question is, when is this 
theory applicable? 

In many instances we are entirely confident about such applications. 
In these cases we seem to have a natural extension of the conceptual process 
by which events are distinguished into “objects” to which set theory (or
formal logic) is applicable. This additional aspect yields that each of a
certain set of events is “equally likely.” Presumably this is a long-range
intuitive integration of experience. The predictions of probability theory are 
themselves probability statements and have an ambiguity that we have 
learned to live with. Do I take an umbrella when the weather prediction is for 
a “20% chance of precipitation”? 

But there are also many situations where this intuitive assignment of 
probability is not available to us, and the mathematical theory of probability 
has been developed to cope with this. In modern science, probability theory 
provides the essential concepts for structuring empirical information into a 
preliminary scaffolding of understanding that may be subject to further 
development. If two events in our experience are associated only by chance, 
we cannot influence the second by dealing with the first. If the use of a certain 
medicine does not correspond to a change of the frequency of recovery from a 
specified disease, it is worthless for this purpose. But where chance occurs, 
the probability measure is often of great practical significance, as we see in 
the case of insurance. 

Thus, in the development of understanding, the probability concept is
very valuable as guidance for the action required. Consequently, the motive
for the theory of probability is either to show the existence of nonchance
relationships between events or, where chance relations do hold, to establish
the characteristics of the probability measure. Probability theory makes only
probability predictions, but if a hypothesis about the probability distribution
in certain circumstances predicts that a specified outcome of an experiment
has only small probability but the experiment consistently yields this
outcome, we may reject the hypothesis. The practical difficulty is the word
“consistently,” and here judgment with a certain connotation of arbitrariness
is required. If the whole situation can be immersed in an intuitive probability
framework so that an a priori probability for the hypothesis is available,
Bayes’ rule will yield answers within this framework. But ultimately the
nature of probability itself requires that there will always be a residue of
arbitrary decision in the acceptance of a theory on an empirical basis.
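
As an illustration of how Bayes’ rule operates once an a priori probability is granted, consider the following sketch. Everything specific in it is an assumption made for the example and does not come from the text: the hypothesis is that a coin is fair, the only alternative considered is a coin with heads probability 0.8, and the a priori probability of fairness is taken to be 0.9.

```python
from math import comb


def binomial_likelihood(heads, tosses, p):
    """Probability of exactly `heads` heads in `tosses` tosses with heads probability p."""
    return comb(tosses, heads) * p**heads * (1 - p)**(tosses - heads)


def posterior(prior_h, like_h, like_alt):
    """Bayes' rule for a hypothesis H against a single alternative."""
    evidence = prior_h * like_h + (1 - prior_h) * like_alt
    return prior_h * like_h / evidence


prior_fair = 0.9  # assumed a priori probability that the coin is fair
for heads, tosses in [(6, 10), (9, 10), (18, 20)]:
    like_fair = binomial_likelihood(heads, tosses, 0.5)
    like_biased = binomial_likelihood(heads, tosses, 0.8)
    print(heads, tosses, round(posterior(prior_fair, like_fair, like_biased), 4))
```

The posterior probability of the fair-coin hypothesis falls as outcomes that it considers improbable accumulate, but where to set the threshold for rejection remains the arbitrary decision discussed above.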

Hence, in general the mathematical theory of probability is concerned 
with a sequential or multidimensional structure of events and the appropriate 
measures and integrals. The objective is to obtain “asymptotic results,” i.e., 
methods for approximating limits by taking n large where n is either the 
sequence subscript or the dimension. This theory is immediately available 
in the references, i.e., Todhunter, (17) Feller, (9) or Cramer. (7) The student is 
probably aware of the meaning of such technical terms as distribution func¬ 
tion, frequency, mean, expected value, variance, and likelihood. The asymp¬ 
totic theory is dependent on the use of Stirling’s formula and the Fourier 
transform. Major distinctions are made between discrete events and con¬ 
tinuous sets of events and in the latter between the normal distribution case 
and the more general case in which normality is not assumed. The theory of 
statistics is structured by the objectives of determining nonprobabilistic 
relations and, where probabilistic associations are known, of determining the 
characteristics of the probability distribution. 

Todhunter’s history describes a considerable variety of games of chance,
following the example of Montmort, who believed this to be necessary because
“pour l’ordinaire, les Sçavans ne sont pas Joueurs” (ordinarily, scholars are
not gamblers). But in addition there are applications to lotteries, life
insurance, annuities, demography, errors in experiments, the incidence of
smallpox, and the effect of inoculation with smallpox, which was the dangerous
procedure that preceded Jenner’s relatively safe vaccination with cowpox.
There was also considerable philosophical speculation using probability.

For further discussion of the material in this section, see Cramer, (6,7)
Feller, (9) Flower, (10) and Todhunter. (17)


8.2. Applications 

Of course, the modern range of applications of probability is more
extensive. In matters of skill, probability evaluations are usually given. For
example, in baseball one has batting averages, fielding averages, earned run
averages, etc. Reliability of manufactured products is expressed as the
probability of the product being free of defects. The biological models for
heredity, mutation, and survival are based on probability. The modern theory of
statistics developed in association with experimentation in agriculture.
The testing of drugs and of medical procedures such as vaccination is also an
example of applied statistics.

Much of mathematical analysis is formulated in terms of a choice or 
strategy. In a maximum problem in the calculus one has a situation in which 
desirability is measured by the value of a function whose variable x can be 
specified. This can be considered a “one-person” game. In a two-person game 
the outcome, say, the amount won by one player, will usually depend on 
choices by both players. This is analogous to considering the outcome as a 
function of two variables, x and y, which the first player tries to maximize by 
his choice of x and which the second player tries to minimize by his choice of 
y. In a smooth case the play would correspond to a saddle point for the out¬ 
come function. When more than two players are involved, there is an addi¬ 
tional possibility for cooperation by a proper subset of the players. 
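
For a finite two-person game the saddle-point idea can be made concrete with a payoff table. The sketch below is illustrative only: the payoff matrices are invented for the example, rows are the first player’s choices (who maximizes), and columns are the second player’s choices (who minimizes).

```python
def maximin(payoff):
    """Largest payoff the row player can guarantee: max over rows of the row minimum."""
    return max(min(row) for row in payoff)


def minimax(payoff):
    """Smallest payoff the column player can hold the row player to."""
    columns = list(zip(*payoff))
    return min(max(col) for col in columns)


def saddle_points(payoff):
    """Entries that are the minimum of their row and the maximum of their column."""
    points = []
    for i, row in enumerate(payoff):
        for j, value in enumerate(row):
            column = [r[j] for r in payoff]
            if value == min(row) and value == max(column):
                points.append((i, j))
    return points


game_with_saddle = [[3, 1, 4],
                    [2, 0, 1],
                    [5, 2, 3]]
matching_pennies = [[1, -1],
                    [-1, 1]]

print(maximin(game_with_saddle), minimax(game_with_saddle), saddle_points(game_with_saddle))
print(maximin(matching_pennies), minimax(matching_pennies), saddle_points(matching_pennies))
```

When the two guaranteed values coincide, the game has a saddle point in pure choices. When they differ, as in the second table, neither player has a safe pure choice, which is where randomized (probabilistic) strategies enter the theory.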

The theory of such strategies is called the theory of games. It has been
extensively studied as a basis for models for economic theory. There is also
considerable interest in winning strategies for recreational games (see
Blackwell and Girshick, (1) von Neumann and Morgenstern, (19) or Burger (4)).

The critical development of statistics occurred early in the twentieth 
century. This was one element in the tremendous increase in sophistication of 
experimentation in the natural sciences, physics, biology, and chemistry. 
Another aspect was the understanding of electromagnetism and light and the 
use of electromagnetic radiation. Improvements in instrumentation such as 
the diffraction spectroscope and electronic amplification were balanced by 
mathematical procedures using probability theory and integral transform 
analysis. In the 1950s, large-scale automatic electronic computation added 
new dimensions. The exploration of the physical structure of the cell and the 
chemical life processes was extremely effective. This utilized the electron 
microscope and X-ray diffraction analysis. The latter had been applied 
initially to inorganic crystals but was expanded by the use of Fourier series 
to yield the structure of remarkably complex organic molecules (see Bragg (3)).

For further discussion of the material in this section, see Blackwell and
Girshick, (1) Bragg, (3) Burger, (4) and von Neumann and Morgenstern. (19)

8.3. Probability and Mechanics 

At first glance it may seem that a probability description of a toss of a 
coin and a dynamic description based on Newtonian physics are anti¬ 
thetical. One could apply either but not both. However, the two descriptions 
can be readily reconciled. When we toss the coin we do not precisely deter¬ 
mine the initial conditions of the motion. The outcome of the toss is clearly 
dependent on the state of motion and position of the coin as it leaves the 
hand, since the motion after this point is clearly that of a rigid body. There 
may be some elastic reaction when landing. Presumably there is a probability 
distribution for these initial conditions, and a more complete mathematical 
description of the situation would include this probability distribution for 
the initial conditions as well as the dynamic description of the motion. Thus, 
the two aspects are complementary parts of a more complete mathematical 
theory. 
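
The reconciliation described above can be imitated numerically. The following sketch rests on a deliberately crude model that is an assumption of this example, not of the text: the coin leaves the hand heads up with vertical speed v and spin rate omega about a horizontal axis, is caught at the launch height, and there is no bounce. The outcome is then a deterministic function of (v, omega), and the only randomness is the assumed spread in those initial conditions.

```python
import math
import random

G = 9.81  # gravitational acceleration in m/s^2


def lands_heads(v, omega):
    """Deterministic outcome for launch speed v (m/s) and spin omega (rad/s).

    The coin starts heads up and is caught at launch height, so it shows
    heads exactly when the cosine of the total rotation angle is positive.
    """
    flight_time = 2.0 * v / G
    angle = omega * flight_time
    return math.cos(angle) > 0.0


def heads_frequency(trials, seed=0):
    """Frequency of heads when the initial conditions are drawn at random."""
    rng = random.Random(seed)
    heads = 0
    for _ in range(trials):
        v = rng.uniform(2.0, 3.0)          # assumed spread of launch speeds
        omega = rng.uniform(100.0, 200.0)  # assumed spread of spin rates
        heads += lands_heads(v, omega)
    return heads / trials


print(heads_frequency(20000))
```

Because the rotation angle sweeps through many full turns over the assumed range of initial conditions, the heads frequency comes out close to one half even though each individual toss is fully determined by the dynamics.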

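This complementary characteristic is valid over a wide range. Suppose
a system is governed by a Hamiltonian,

$$H(p, q), \qquad p_i,\ q^i, \quad i = 1, \ldots, n,$$

$$\dot p_i = -\frac{\partial H}{\partial q^i}, \qquad \dot q^i = \frac{\partial H}{\partial p_i}.$$

Let $\pi$ denote the differential $\prod_i dp_i\,dq^i$ and let $\pi_i$ denote the result of omitting
$dp_i$ and $\pi^i$ the result of omitting $dq^i$. Let $\mathfrak{S}$ be a surface that bounds a region
$\mathfrak{R}$. Then the rate at which the volume of $\mathfrak{R}$ changes is given by

$$\frac{dV}{dt} = \frac{d}{dt}\int_{\mathfrak{R}} \pi
= \int_{\mathfrak{S}} \sum_i \left(\dot p_i\,\pi_i - \dot q^i\,\pi^i\right)
= \int_{\mathfrak{R}} \sum_i \left(\frac{\partial^2 H}{\partial q^i\,\partial p_i} - \frac{\partial^2 H}{\partial p_i\,\partial q^i}\right)\pi
= 0.$$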

Thus, the volume in the phase space {p, q} is invariant under the motion. 
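
The invariance of phase-space volume can be observed numerically in the simplest case n = 1, where volume is area. The sketch below is an illustration under stated assumptions: the pendulum Hamiltonian $H = p^2/2 - \cos q$ is chosen for the example, and the motion is approximated with a leapfrog scheme, each step of which is itself an area-preserving map, so the code illustrates the theorem rather than proving it.

```python
import math


def leapfrog_step(q, p, dt):
    """One kick-drift-kick step for the pendulum Hamiltonian H = p^2/2 - cos q."""
    p -= 0.5 * dt * math.sin(q)  # dp/dt = -dH/dq
    q += dt * p                  # dq/dt =  dH/dp
    p -= 0.5 * dt * math.sin(q)
    return q, p


def polygon_area(points):
    """Shoelace formula for the area enclosed by a closed polygon."""
    total = 0.0
    for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]):
        total += q1 * p2 - q2 * p1
    return abs(total) / 2.0


def evolve(points, dt, steps):
    """Carry every boundary point along the approximate Hamiltonian flow."""
    for _ in range(steps):
        points = [leapfrog_step(q, p, dt) for q, p in points]
    return points


# Boundary of a small disk of initial conditions around (q, p) = (1.0, 0.5).
n = 400
boundary = [(1.0 + 0.1 * math.cos(2 * math.pi * k / n),
             0.5 + 0.1 * math.sin(2 * math.pi * k / n)) for k in range(n)]

area_before = polygon_area(boundary)
area_after = polygon_area(evolve(boundary, dt=0.01, steps=500))
print(area_before, area_after)
```

The region is sheared and distorted by the motion, yet the enclosed area is unchanged up to the small error of representing the boundary by a polygon.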

This volume in phase space is not a probability despite its apparent 
measure character. It should be regarded as the equivalent to a count of 
possibilities or of “events” like the faces of a die. Since the total measure is 
infinite, one cannot take the probability as simply proportional to this 
measure. In modern developments this Liouville measure is replaced by 
discrete counts of possibilities, but even here these possibilities cannot be 
considered equally likely. 

The assignment of probabilities to the possibilities associated with a 
region in phase space at an instant of time is dependent on the availability of 
energy. Suppose we have a large number, N, of systems of the above type that 
are in thermal equilibrium with the rest of the universe, i.e., have a net zero 
exchange of energy with the remainder of the universe. Furthermore, these 
systems interact with each other only sporadically. 
We divide the phase space of an individual system into a number of
small regions, $R_i$, each with approximately constant energy $\varepsilon_i$. We ask what
is the probability that the N systems can be distributed among these small
regions so that $n_i$ of them are in the region $R_i$.

To each region $R_i$ we assign a weight $\omega_i$, which we consider to be a
count of the number of possible states that are in the region $R_i$. This weight
$\omega_i$ is proportional to the volume of phase space $V_i$ for $R_i$ in the case of a
classical system. We assume that the volume $h^n$ corresponding to one state
is so small that $\omega_i = V_i/h^n$ can be considered a relatively large integer.

We make a number of other classical assumptions. We suppose that
we can distinguish the individual systems so that the number of ways in which
the N systems can be divided into sets with numbers $n_1, n_2, \ldots$ is given by
$N!/(n_1!\,n_2!\cdots)$. Each of the $n_i$ systems assigned to the $R_i$ region can be placed
in $\omega_i$ different states, so that the number of possibilities for the proposed
distribution is

$$\nu(n_1, n_2, \ldots) = N!\,\omega_1^{n_1}\omega_2^{n_2}\cdots/(n_1!\,n_2!\cdots).$$

If only one system can be assigned to a given state, $\omega_i^{n_i}$ is to be replaced by
$\omega_i!/(\omega_i - n_i)!$, but this will not make much difference if $\omega_i$ is large relative to $n_i$.

But the energy equilibrium requirement states that only distributions
$n_1, n_2, \ldots$ with a given total energy $NU = \sum_i n_i\varepsilon_i$ should be considered and, of
course, we must have $N = \sum_i n_i$. Subject to these restrictions we can regard
all such possibilities as equally likely, and $\nu(n_1, n_2, \ldots)$ is a relative
(unnormalized) probability. Notice that suitable restrictions on the $\varepsilon_i$ will ensure that
the total number of possibilities is finite.
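
For a very small example the counting can be carried out exhaustively. The sketch below is illustrative: the three energy levels, the weights, the number of systems, and the total energy are all invented for the example, and the energies are taken to be nonnegative integers so that the energy condition can be tested exactly.

```python
from math import factorial


def count_arrangements(occupation, weights):
    """nu(n_1, n_2, ...) = N! * prod(w_i ** n_i) / prod(n_i!), in exact integers."""
    total = factorial(sum(occupation))
    for n_i, w_i in zip(occupation, weights):
        total = total * w_i**n_i // factorial(n_i)
    return total


def allowed_distributions(n_systems, energies, total_energy):
    """All (n_1, ..., n_m) with sum n_i = N and sum n_i * e_i = total_energy.

    Assumes nonnegative integer energies.
    """
    def extend(prefix, systems_left, energy_left, level):
        if level == len(energies) - 1:
            # the last level must absorb all remaining systems and energy
            if systems_left * energies[level] == energy_left:
                yield prefix + (systems_left,)
            return
        for n_i in range(systems_left + 1):
            used = n_i * energies[level]
            if used <= energy_left:
                yield from extend(prefix + (n_i,), systems_left - n_i,
                                  energy_left - used, level + 1)

    yield from extend((), n_systems, total_energy, 0)


energies = [0, 1, 2]   # hypothetical energy levels
weights = [2, 3, 4]    # hypothetical state counts for the three regions
distributions = list(allowed_distributions(12, energies, 12))
most_probable = max(distributions, key=lambda d: count_arrangements(d, weights))

for d in distributions:
    print(d, count_arrangements(d, weights))
print("most probable:", most_probable)
```

Even with only twelve systems one distribution already dominates the count, which is the behavior exploited for large N in what follows.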

Given the values of N and U, the obvious objective is to normalize the
relative probability $\nu$. However, the practical procedure in the “classical”
case of $\omega_i$ large is governed by the fact that there is a most probable set of
values $n_1, n_2, \ldots$ and that for N large the probability of a distribution with
values significantly different from the most probable values is very small.
Thus, the practical procedure consists in assuming that the N systems are
distributed as they are in the most probable distribution.

To determine the most probable set of values for the $n_1, n_2, \ldots$ one must
maximize $\nu(n_1, n_2, \ldots)$. Consider

$$\log \nu(n_1, n_2, \ldots) = \log N! - \sum_a \log n_a! + \sum_a n_a \log \omega_a.$$

We will simply lump additive constants. By Stirling’s formula
$n! \approx n^n \exp(-n)\,(2\pi n)^{1/2}$, and

$$\log \nu = -\sum_a \left[(n_a + \tfrac{1}{2}) \log n_a - n_a - n_a \log \omega_a\right] + C.$$

This expression must be maximized subject to the side conditions $\sum_a n_a = N$,
$\sum_a n_a\varepsilon_a = NU$. This requires constants $\alpha$ and $\beta$ such that

$$\frac{\partial \log \nu}{\partial n_i} = -\alpha + \beta\varepsilon_i$$

or

$$-\frac{1}{2n_i} - \log n_i + \log \omega_i = -\alpha + \beta\varepsilon_i.$$

Neglecting the term $1/2n_i$, we obtain, for $A = e^{\alpha}$,

$$n_i = \omega_i A \exp(-\beta\varepsilon_i).$$

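If we let $F(\beta) = \sum_a \omega_a \exp(-\beta\varepsilon_a)$, we obtain by adding these equations
$N = AF(\beta)$. Thus,

$$\frac{n_i}{N} = \frac{\omega_i \exp(-\beta\varepsilon_i)}{F(\beta)}.$$

This can be interpreted as a statement that the probability that a system be in
the region $R_i$ is $\omega_i \exp(-\beta\varepsilon_i)/F(\beta)$, with $\beta$ determined by the condition

$$NU = \sum_a \varepsilon_a n_a = N \sum_a \frac{\varepsilon_a\,\omega_a \exp(-\beta\varepsilon_a)}{F(\beta)} = -N \frac{\partial}{\partial\beta} \log F(\beta).$$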
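
The condition that fixes $\beta$ can be solved numerically for given levels and weights. The sketch below is an illustration with invented data (three levels, hypothetical weights, and a prescribed mean energy U = 1); bisection is used because the mean energy is a decreasing function of $\beta$. The names `partition_sum`, `mean_energy`, and `solve_beta` are introduced here for the example only.

```python
import math


def partition_sum(beta, energies, weights):
    """F(beta) = sum of w_a * exp(-beta * e_a)."""
    return sum(w * math.exp(-beta * e) for e, w in zip(energies, weights))


def mean_energy(beta, energies, weights):
    """U(beta) = sum of e_a * w_a * exp(-beta * e_a) / F(beta)."""
    f = partition_sum(beta, energies, weights)
    return sum(e * w * math.exp(-beta * e) for e, w in zip(energies, weights)) / f


def solve_beta(target_u, energies, weights, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for the beta at which the mean energy equals target_u."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid, energies, weights) > target_u:
            lo = mid  # energy too high: beta must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


energies = [0.0, 1.0, 2.0]  # hypothetical energy levels
weights = [2.0, 3.0, 4.0]   # hypothetical weights
beta = solve_beta(1.0, energies, weights)
f = partition_sum(beta, energies, weights)
probabilities = [w * math.exp(-beta * e) / f for e, w in zip(energies, weights)]

# Check U = -d(log F)/d(beta) with a centered finite difference.
step = 1e-6
derivative = (math.log(partition_sum(beta + step, energies, weights))
              - math.log(partition_sum(beta - step, energies, weights))) / (2 * step)
print(beta, probabilities, -derivative)
```

For these particular numbers the condition reduces to $4e^{-2\beta} = 2$, so that $\beta = \tfrac{1}{2}\log 2$, which gives a check on the bisection.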


For further discussion of the material in this section, see Fowler, (11)
Mayer and Mayer, (15) and Schrodinger. (16)


8.4. Relation to Thermodynamics 

With the above probability interpretation, U is the expected value
of the internal energy of the system, and the immediate question is to identify
other thermodynamic functions in terms of this probability picture. F is a
function, $F(\beta, \varepsilon_1, \ldots)$, of $\beta$ and the energy levels and

$$d \log F = -U\,d\beta - (\beta/F) \sum_a \omega_a \exp(-\beta\varepsilon_a)\,d\varepsilon_a
= -U\,d\beta - \beta \sum_a (n_a/N)\,d\varepsilon_a.$$

Schrodinger interprets $\sum_a (n_a/N)\,d\varepsilon_a$ as the expected value of the work done
on an individual system by external agencies to increase the average energy,
i.e., $\sum_a (n_a/N)\,d\varepsilon_a = -dW$, where W is the work done by the system. Thus,

$$d \log F + U\,d\beta + \beta\,dU = \beta\,dU + \beta\,dW = \beta(dU + dW)$$

or

$$d(\log F + \beta U) = \beta\,dQ = \beta T\,dS,$$

where dQ is the heat flow into a system. This implies that $\varphi = \log F + \beta U$
is a function of S, $\varphi = \varphi(S)$, and $d\varphi/dS = \beta T$.

Thus, $\log F + \beta U = \varphi(S)$, and we must determine $\varphi(S)$. But in terms of
systems, $\beta$ is intensive and U and S are extensive functions. The usual
discussions of statistical mechanics now show that log F is an extensive function
by considering the case of a system obtained by combining two subsystems.
Hence, $\varphi$ and S are both extensive and consequently they are linearly related,
i.e., $k\varphi = S$ and $\beta T = \varphi' = 1/k$ or $\beta = 1/kT$. k is called the Boltzmann constant.
We have, then,

$$k \log F = S - U/T.$$

What is of some interest is to interpret S in terms of the probability
picture. If $n_1, n_2, \ldots$ is the most likely set of values, then $\nu(n_1, n_2, \ldots)$ is the
number of ways of obtaining these values. In the classical case, using Stirling’s
formula freely, we obtain, since $\sum_a n_a = N$, $\sum_a n_a\varepsilon_a = NU$,

$$\begin{aligned}
\log \nu &= \log N! + \sum_a (n_a \log \omega_a - \log n_a!) \\
&= N \log N - N + \sum_a n_a(\log \omega_a - \log n_a + 1) \\
&= N \log N + \sum_a n_a(\log \omega_a - \log n_a) = N \log N + \sum_a n_a(-\alpha + \beta\varepsilon_a) \\
&= N \log N - N\alpha + \beta NU.
\end{aligned}$$

But $\alpha = \log A = \log N - \log F$ and thus

$$\log \nu = N(\log F + \beta U) = NS/k$$

or

$$S = k(\log F + \beta U) = (k \log \nu)/N = k \log(\nu^{1/N}).$$

S is therefore a measure of the number of possibilities for the system in the 
most probable case. Notice that the statistical argument yields a formula for 
F and that U and S are expressed in terms of this formula. 
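
The relation between S and the count of possibilities can be tested numerically for a large number of systems. The sketch below uses invented levels, weights, and a value of $\beta$ chosen for illustration, and it evaluates $\log \nu$ with the log-gamma function so that no huge integers are needed. It compares $(\log \nu)/N$ with $\log F + \beta U$, that is, it compares the two expressions for S/k.

```python
import math


def log_nu(occupation, weights):
    """log of N! * prod(w_i ** n_i / n_i!), computed with lgamma."""
    n_total = sum(occupation)
    result = math.lgamma(n_total + 1)
    for n_i, w_i in zip(occupation, weights):
        result += n_i * math.log(w_i) - math.lgamma(n_i + 1)
    return result


energies = [0.0, 1.0, 2.0]      # hypothetical energy levels
weights = [2.0, 3.0, 4.0]       # hypothetical weights
beta = 0.5 * math.log(2.0)      # illustrative value of beta
n_systems = 1_000_000

f = sum(w * math.exp(-beta * e) for e, w in zip(energies, weights))

# Most probable occupation numbers n_i = N * w_i * exp(-beta e_i) / F, rounded.
occupation = [round(n_systems * w * math.exp(-beta * e) / f)
              for e, w in zip(energies, weights)]
occupation[-1] = n_systems - sum(occupation[:-1])  # enforce sum n_i = N exactly

u = sum(n * e for n, e in zip(occupation, energies)) / n_systems

entropy_per_k_from_count = log_nu(occupation, weights) / n_systems
entropy_per_k_from_f = math.log(f) + beta * u
print(entropy_per_k_from_count, entropy_per_k_from_f)
```

The two numbers agree up to terms of order $(\log N)/N$, which are the terms discarded when Stirling’s formula is used “freely.”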

As an example, consider an “ideal gas” as a collection of particles that
undergo only elastic collisions in a spatial region $\mathfrak{R}$ with volume $V_a$. We ignore
gravity. For a system consisting only of a single particle, the phase space is
six-dimensional with coordinates $x^1, x^2, x^3, p_1, p_2, p_3$, where $(x^1, x^2, x^3)$ is a
point of $\mathfrak{R}$ and $-\infty < p_i < \infty$. If R is a small region in phase space containing
the point $x^1, x^2, x^3, p_1, p_2, p_3$ and with volume $dx^1\,dx^2\,dx^3\,dp_1\,dp_2\,dp_3 = dV$,
then we can take $\varepsilon = (1/2m)(p_1^2 + p_2^2 + p_3^2)$ and $\omega = dV/h^3$. Thus,

$$\begin{aligned}
h^3 F &= \sum_a \exp(-\beta\varepsilon_a)\,h^3\omega_a \\
&= \int \exp\left[-(\beta/2m)(p_1^2 + p_2^2 + p_3^2)\right] dx^1\,dx^2\,dx^3\,dp_1\,dp_2\,dp_3 \\
&= V_a \left(\int_{-\infty}^{\infty} \exp(-\beta p^2/2m)\,dp\right)^3 \\
&= V_a\,(m/\beta)^{3/2} \left(\int_{-\infty}^{\infty} \exp(-u^2/2)\,du\right)^3 \\
&= V_a\,(2\pi m/\beta)^{3/2}.
\end{aligned}$$

Hence,

$$\log F = \log V_a - \tfrac{3}{2} \log \beta + C.$$


This is log F for a single particle. Since log F is extensive, the $F_L$ for a gas
consisting of L particles is given by

$$\log F_L = L \log F = L \log V_a - \tfrac{3}{2} L \log \beta + LC.$$

The terms in $d \log F_L$ have been interpreted at the beginning of this section as
$d \log F_L = -U_L\,d\beta + \beta\,dW_L$ for $W_L$, the work done by the gas, i.e., $dW_L = p_L\,dV$.
Since $\log F_L$ has been expressed as a function of $\beta$ and V,

$$U_L = -\frac{\partial}{\partial\beta}(\log F_L) = \tfrac{3}{2} L/\beta, \qquad p_L = \frac{1}{\beta}\frac{\partial}{\partial V}(\log F_L) = L/\beta V$$

or $U_L = \tfrac{3}{2} kLT$ and $p_L V = kLT$. If $L_0$ is the number of molecules in a mole of
gas, then $R = kL_0$. If $\mu$ is the number of moles represented by the given gas sample,
so that $L = \mu L_0$, the last equation becomes the familiar $pV = \mu RT$. The formula for U indicates
that the specific heat at constant volume is $\tfrac{3}{2} kL_0 = \tfrac{3}{2} R$.
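
The ideal gas relations can be checked numerically without using the closed form of the Gaussian integral. In the sketch below the momentum integral is evaluated by the trapezoid rule and the derivatives of $\log F_L$ by centered differences. The additive constant coming from h is dropped, since it does not affect either derivative, and the sample values of $\beta$, V, and L are arbitrary choices for the illustration.

```python
import math


def momentum_integral(beta, mass, steps=20_000):
    """Trapezoid rule for the integral of exp(-beta p^2 / 2m) over the real line."""
    sigma = math.sqrt(mass / beta)        # width of the Gaussian in p
    lo, hi = -12.0 * sigma, 12.0 * sigma  # the tails beyond this are negligible
    dp = (hi - lo) / steps
    total = 0.5 * (math.exp(-beta * lo * lo / (2 * mass))
                   + math.exp(-beta * hi * hi / (2 * mass)))
    for k in range(1, steps):
        p = lo + k * dp
        total += math.exp(-beta * p * p / (2 * mass))
    return total * dp


def log_f(beta, volume, mass=1.0, particles=1):
    """log F_L for L particles, omitting the additive constant from h."""
    return particles * (math.log(volume) + 3.0 * math.log(momentum_integral(beta, mass)))


def energy_and_pressure(beta, volume, particles, d=1e-5):
    """U_L = -d(log F_L)/d(beta) and p_L = (1/beta) d(log F_L)/dV by centered differences."""
    u = -(log_f(beta + d, volume, particles=particles)
          - log_f(beta - d, volume, particles=particles)) / (2 * d)
    p = (log_f(beta, volume + d, particles=particles)
         - log_f(beta, volume - d, particles=particles)) / (2 * d) / beta
    return u, p


beta, volume, particles = 2.0, 3.0, 5
u, p = energy_and_pressure(beta, volume, particles)
print(u, 1.5 * particles / beta)          # numerical U_L against (3/2) L / beta
print(p, particles / (beta * volume))     # numerical p_L against L / (beta V)

# Molar specific heat (3/2) k L_0 = (3/2) R in J/(mol K), from the SI values of k and L_0.
k_boltzmann = 1.380649e-23
l_avogadro = 6.02214076e23
print(1.5 * k_boltzmann * l_avogadro)
```

The value $\tfrac{3}{2}R$ is the specific heat of a monatomic gas, the only case covered by this single-particle phase space.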

Thus, in the case of a “perfect gas” one has theoretically proven relations
to use instead of empirical rules. This discussion is based on classical
dynamics and a requirement that the $\omega_i$ corresponding to a mesh interval $R_i$
represent a large number of possibilities. On the other hand, by making a
correct choice of h, we obtain an absolute entropy as distinct from just a
difference of entropy between two thermodynamic states. We also obtain
an absolute energy U rather than simply energy differences.

For further discussion of the material in this section, see Fowler (11) and
Schrodinger. (16)
8.5. The Fine Structure of Matter 

The above procedure yields the Maxwell distribution for velocities. 
The more general classical statistical mechanics provided insight into kinetic 
processes in gases, chemical reactions, thermodynamics, and electromagnetic 
radiation. Thus, probability theory supplemented classical science to provide 
a much more extensive analysis of the behavior of matter. There was a
correspondingly more sophisticated set of mathematical procedures that added
probability concepts to the previous applied analysis of geometry,
integration, and partial differential equations.

The very completeness of the theoretical picture represented by this 
formulation soon indicated inconsistencies. Radiation equilibrium was 
shown to be inconsistent with a continuous distribution of energy values, and 
experimental agreement was obtained only by assuming discrete energy 
levels. The discovery of the electron and the Rutherford structure of the atom 
indicated major discrepancies in the classical theory of electromagnetism 
and the specific heat of metals. The wave nature of light had been accepted 
because of interference phenomena, but the photoemission of electrons also 
indicated a particle nature. Furthermore, the electrons that had been con¬ 
sidered particles also exhibited interference effects as if they had a wave 
character. 

The resolution of these difficulties, which led to quantum mechanics, is
described in readily accessible literature (see Born (2) and van der Waerden (18)).
This was an extraordinary intellectual achievement that began in the nine¬ 
teenth century with an empirical mathematical description of the line spectra 
of certain chemical elements. In the twentieth century these results, relativity 
theory, and continuing experimentation structured a deepening under¬ 
standing of atomic phenomena. Their understanding in turn yielded an 
integrated viewpoint of physics, chemistry, and the phenomena associated 
with matter. 

The immediate objective was the interaction of matter and radiation, but 
intellectually the quantum mechanics should be considered as rectifying and 
completing the theoretical picture of the classical exact sciences. One major 
element of the previous situation was preserved. Macroscopic phenomena 
were represented as the probabilistic consequences of the action of a very 
large number of very small subsystems governed by chance. In general these 
subsystems are not independent in the probabilistic sense; for example, in 
solids they are spatially related in their actions. But more fundamentally, 
these subsystems cannot be individually identified, and a correct assignment 
of probability is obtained only by taking this into account. However, when 
this is done, notions of thermal equilibrium still have the same probabilistic 
character. 
A new approach was required to describe the behavior of the micro¬ 
systems. The observable output was essentially probabilistic but quite 
different from that due to the “particle systems” of classical mechanics, where 
probability could be attached to distributions in phase space. In the early 
steps of the development of quantum mechanics, particle phase space was 
quantized, i.e., the continuous range was replaced by discrete levels corres¬ 
ponding to spectroscopic observations. But this in turn led to the use of the 
“state” of the subsystem as the fundamental concept in the mathematical 
theory. The state of the system has essentially a wave character, and the 
original particle description became a set of rules for determining the 
possibilities for states. The wave character of the state of a microsystem 
may be associated with a function defined on configuration space, i.e., the 
space of coordinates. Computational procedures were quite different and 
aimed at computing the probability of an event associated with the
microsystem.

This development had a very significant interaction with mathematics. 
The early formulations were efforts to adjust the classical Hamilton-Jacobi 
mechanics, electrodynamics, and probability theory to fit spectroscopic 
phenomena. The transition to wave mechanics was facilitated by known 
mathematical methods to solve partial differential equations that had led to 
“eigenvalue” problems, and these solutions are still of considerable practical 
importance. But a satisfactory logical treatment of quantum mechanics
required more mathematics. The perturbation problem required a more
abstract formulation in terms of linear operators. As a consequence, the 
spectral theory of self-adjoint operators in Hilbert space was developed by 
von Neumann and others. Dirac’s axiomatic treatment of quantum electrodynamics
incorporated a theory of the electron and positron based on representations
of the Lorentz group. Lie group theory and the associated theory
of Lie algebras had been developed in the nineteenth century in connection 
with the theory of differential equations. These now took on a much more 
abstract form, e.g., C* algebras, which permitted theoretical speculation 
concerning ultimate physical structures. 

Another mathematical development was that of distributions (see
Guelfand and Chilov (12)). The explicit description of distribution theory is in
terms of linear functionals on spaces of functions, but it also can be considered
as a development of the theory of Fourier transforms. In recent times
there has been a considerable reformulation of the analysis associated with 
partial differential equations in terms of distributions. This has been justified 
by the considerable extension of capability. 

Quantum mechanics produced a complete and satisfactory description 
of chemistry in the molecular sense and for radiation within a tremendous 
range of energy (see Herzberg (14) and Heitler (13)). This had many technological
consequences, especially in electronics, after the Second World War.
There was also a considerable understanding of nuclear structure, which 
produced nuclear weapons and an explanation of solar energy. 

For further discussion of the material in this section, see Born (2),
Condon and Shortley (5), Dirac (8), Guelfand and Chilov (12), Heitler (13),
Herzberg (14), and van der Waerden (18).


8.6. Analysis 

The history of analysis should be associated with a precise exposition
of the subject itself, but some general comments may be desirable.
Both Leibnitz and Newton stated the general principle of the integral calculus, 
which permits one to evaluate definite integrals by finding antiderivatives. 
Thus, one solves a differential equation instead of summing algebraically, the 
process that had been used in the seventeenth century. 

The Euler equations in the calculus of variations and integral relations
such as Stokes' theorem or Green's theorem lead to differential equations,
either ordinary or partial. The procedure first used to solve differential
equations was either infinite series or separation of variables followed by a 
series solution of the resulting ordinary differential equation. The convergence
of infinite series was the first aspect of a more rigorous basis for analysis
in which an axiomatic description of the Euclidean line corresponded to the 
system of real numbers. The corresponding geometric description of complex 
numbers as the Euclidean plane led to the theory of the analytic functions 
of a complex variable. 

Infinite series analysis yielded existence and uniqueness results “in
the small,” i.e., the domain of existence would be finite but subject to ad hoc
estimates of size. To handle existence and uniqueness in the large, two essentially
different but complementary approaches were used. One was the introduction
of “topological” notions in the point set sense and the corresponding
analysis of continuous functions and sequences of such functions. The other 
approach was that of infinite dimensional spaces in which a linear partial 
differential operator can be considered as a linear transformation. Each of 
these approaches involved a considerable amount of interacting development.
For example, these required the extension of integration in the
Lebesgue sense or in the Lebesgue-Stieltjes-Radon-Nikodym sense. 

The linear transformation theory of differential operators began with the 
investigation of linear differential equations with constant coefficients, and 
for these the use of Fourier transforms or Laplace transforms was particularly 
effective. The Sturm-Liouville theory indicated a beautiful analogy between
a more general class of differential operators and symmetric matrices in
regard to characteristic values and vectors. Fredholm presented a precise 
basis for these indications by his analysis of the inverse integral operator. 
This in turn led to the introduction of Hilbert space by Hilbert. The spectral 
theory of self-adjoint operators has been further developed by Riesz, von 
Neumann, Carleman, Friedrichs, Stone, Lorch, and others using modern 
abstract concepts. 
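
The analogy between differential operators and symmetric matrices can be
illustrated numerically. The following Python sketch is not taken from the text.
It assumes the simplest Sturm-Liouville problem, -u'' = λu on (0, π) with
u(0) = u(π) = 0, and replaces the operator by the usual second-difference matrix
on a uniform grid. The function names and the grid size are illustrative choices.
Under these assumptions the sampled sine functions are characteristic vectors of
a symmetric matrix, they are mutually orthogonal, and the characteristic values
approach 1, 4, 9, ... as the grid is refined.

```python
import math

def second_difference_matrix(n):
    """Symmetric n-by-n matrix approximating -u'' on (0, pi), u(0) = u(pi) = 0."""
    h = math.pi / (n + 1)            # spacing of the assumed uniform grid
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 2.0 / h**2
        if i > 0:
            a[i][i - 1] = -1.0 / h**2
        if i < n - 1:
            a[i][i + 1] = -1.0 / h**2
    return a

def sine_vector(n, k):
    """Samples of sin(k x) at the interior grid points."""
    h = math.pi / (n + 1)
    return [math.sin(k * (j + 1) * h) for j in range(n)]

def matvec(a, v):
    """Product of the matrix a with the vector v."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]

def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

def rayleigh_quotient(a, v):
    """(v, Av)/(v, v); this is the characteristic value when v is a characteristic vector."""
    return dot(v, matvec(a, v)) / dot(v, v)

if __name__ == "__main__":
    n = 50
    a = second_difference_matrix(n)
    for k in (1, 2, 3):
        print(k, rayleigh_quotient(a, sine_vector(n, k)))   # close to k**2
    print(dot(sine_vector(n, 1), sine_vector(n, 2)))        # close to zero
```

The matrix is kept as a plain list of rows so that the sketch needs only the
standard library; a linear algebra package would be the natural choice for
larger grids.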

The “spectral theory” is applicable to “normal” operators, which one
can think of intuitively as operators in which the characteristic vectors are 
mutually orthogonal. In the more general case the operator can be analyzed 
as a linear transformation in the sense of Banach from one linear space to 
another. The mapping characteristics of such a transformation are given by 
certain theorems of Banach and yield precise uniqueness and existence 
results. If the two linear spaces are Hilbert spaces a further analysis such as 
that given by the spectral theory is possible. 

Sophus Lie showed that the possibilities for continuous groups could 
be analyzed in terms of certain linear partial differential operators and an 
algebraic structure based on these. The notion of invariant measure on such 
groups has led to a theory of the representation of groups and algebras of 
linear operators on linear spaces. This last generalizes the notion of matrix 
algebras on finite dimensional linear spaces. Such algebras, for example, 
von Neumann algebras or W* or C* algebras, are considered to have a close 
conceptual relationship with quantum mechanics. 

Modern discussions of the structure of linear partial differential operators
usually involve the concept of a distribution. One describes a linear set of
functions, the “base functions,” and the distributions are linear functionals on
this base set, i.e., they are functions defined on this set that are linear. For
example, if the base set $\Phi$ consists of functions $\varphi$, and $f$ is such that the integral
exists for all $\varphi$ in $\Phi$, then $I_f(\varphi) = \int f\varphi\,dx$ is a linear functional on $\Phi$ and hence a
distribution. The power of this concept lies in the fact that the set of linear
functionals for a given $\Phi$ is much more extensive than the well-formed integrals.
Smoothness and integrability properties of the base set $\Phi$ can be associated
with the corresponding properties of the distributions. Base sets can be
chosen so that analytic properties for the distributions hold, as do certain 
symmetries under the Fourier transforms. Since functions correspond to a 
subset of the distributions for a given base set, the problem of solving a partial 
differential operator equation Lu = v can be generalized to the case where u 
and v are distributions. This yields a very useful technique for linear partial 
differential operators using the distributions of the delta function and its 
derivatives. 
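
The functional point of view described above can be sketched in a few lines of
Python. Nothing in this sketch comes from the text: the names, the finite
integration interval, and the step sizes are assumptions made for illustration.
A distribution is represented as a procedure that accepts a base function and
returns a number. The integral functional is approximated by a midpoint sum, the
delta function is evaluation at zero, and the derivative of a distribution T is
defined by T'(φ) = -T(φ'), with φ' approximated by a central difference. The base
functions used here are rapidly decreasing Gaussians rather than functions of
compact support.

```python
import math

def regular_distribution(f, a=-10.0, b=10.0, steps=20000):
    """Functional phi -> integral of f*phi over [a, b], by the midpoint rule."""
    h = (b - a) / steps
    def functional(phi):
        total = 0.0
        for i in range(steps):
            x = a + (i + 0.5) * h
            total += f(x) * phi(x)
        return h * total
    return functional

def delta(phi):
    """The delta function as a functional: evaluation of phi at zero."""
    return phi(0.0)

def derivative(dist, eps=1e-5):
    """Derivative of a distribution: phi -> -dist(phi'), phi' by central difference."""
    def functional(phi):
        def dphi(x):
            return (phi(x + eps) - phi(x - eps)) / (2.0 * eps)
        return -dist(dphi)
    return functional

def heaviside(x):
    """Unit step function."""
    return 1.0 if x >= 0.0 else 0.0

def gaussian(x):
    """A smooth, rapidly decreasing base function."""
    return math.exp(-x * x)

if __name__ == "__main__":
    phi = lambda x: math.exp(-(x - 1.0) ** 2)   # shifted Gaussian base function
    step = regular_distribution(heaviside)
    print(derivative(step)(phi), delta(phi))    # the two values nearly agree
    print(derivative(delta)(phi))               # approximates -phi'(0)
```

The step function has no derivative in the classical sense at zero, yet its
derivative as a distribution is well defined and agrees, within the numerical
error of the sketch, with the delta function.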


Exercises

8.1. Describe the Pascal triangle and show that the construction process yields the 
binomial coefficients. 

8.2. Show that the number of ways in which k bins can contain n indistinguishable
objects is $\binom{n+k-1}{k-1}$. [Hint: Consider n + k − 1 positions in order on a line and locate
“separators” at k − 1 of these places and put objects in the remaining positions.]

8.3. Evaluate the coefficient of $x^p$ in the expansion of

$$(x + x^2 + x^3 + x^4 + x^5 + x^6)^n$$

for $n \le p \le 6n$.

8.4. What are the different forms of Stirling's formula and how are they proven? 

8.5. If x and y are independent random variables with distribution functions $f(x)$
and $g(y)$, respectively, show that the distribution function for r = x + y is

$$h(r) = \int_{-\infty}^{\infty} f(x)\,g(r - x)\,dx.$$

Show that if F(r), G(s), and H(t) are the respective Fourier transforms of f, g, and h, then
H(t) = F(t)G(t).

8.6. Two players match coins of a fixed denomination until a player loses all his 
coins. Suppose A has 5 coins, B has 15 coins. What is the probability that B will win? 

8.7. Todhunter (17) (p. 295) expresses a result of Bayes as a ratio of integrals and
states, “Bayes does not use this notation: areas of curves, according to the fashion of his 
time, occur instead of integrals." What is the definition of the integral that Todhunter 
uses in 1865? 

8.8. A game begins with n markers on the table. Each player takes one or two
markers from the table, and the winner is the one who takes the last marker on the table. 
Show that one can always win, provided one can assure that one’s opponent must make 
a choice from a number of markers that is a multiple of three. 

8.9. For the $\varepsilon_i$ associated with regions $K_i$ in phase space, discuss the restrictions so
that the total number of possible distributions $\{n_1, n_2, \ldots\}$ is finite.

8.10. Discuss the normalization process for the relative probabilities $\nu(n_1, n_2, \ldots)$.
In the sum of the permitted $\nu$, factor out the maximum term. This is analogous to certain
discussions of the central limit theorem.

8.11. Show that the entropy for a perfect gas contained in a volume V with molecules
of mass m is

$$S = \mu R\,[\log V + \tfrac{3}{2}\log(2\pi m/\beta) + C]$$

$$= \mu R\,[\log VT^{3/2} + \tfrac{3}{2}\log(2\pi m k) + C],$$

where $\mu$ is the mole fraction, that is, the number of molecules, L, is $\mu L_0$, and C is a
constant independent of the choice of m. Notice the increase in entropy corresponding 
to an expansion of the gas into a vacuum of equal volume through a small aperture. The 
kinetic energy of the molecules will remain the same in such an expansion. 

8.12. Suppose that the N systems of Section 8.3 are indistinguishable but that 
only one system can be put in a given state. Then, in terms of the binomial coefficients, 

.. 

Assume $\omega_i$ is large. The most probable $n_i$ are given by the formula

$$n_i = \omega_i A \exp(-\beta\varepsilon_i)/[1 + A \exp(-\beta\varepsilon_i)],$$

where A and $\beta$ are constants determined by $N = \sum n_i$, $NU = \sum \varepsilon_i n_i$. The last equation may
be replaced by an assumption that $\beta$ is known. In this case, discuss how to determine A.

8.13. If the N systems of the previous exercise are indistinguishable but any number 
can be put in a given state, then 

Assume at least $\omega_i > 1$ and proceed as in the previous exercise.

8.14. Show that if $\sum x_i A_i = 0$ for every $\{x_1, x_2, \ldots\}$ for which $\sum x_i B_i = 0$ and $\sum x_i C_i = 0$,
then there exist constants $\alpha$ and $\beta$ such that $A_i = \alpha B_i + \beta C_i$ for every i. (The $\{x_1, x_2, \ldots\}$
are restricted to sequences in which only a finite number of terms are not zero.)

8.15. Suppose T and S are functions of variables $x_1, \ldots, x_n$, with $n \ge 2$ and
$dy = f(x_1, \ldots, x_n)\,dS$. Show that y is a function of S, $y = \varphi(S)$, with $f = d\varphi/dS$. [Hint: Change
variables to a new set $u_1, \ldots, u_n$ with $u_1 = S$.]

8.16. Discuss the extensive character of the function log F relative to systems. 

8.17. In an elastic collision two particles interact so that the total momentum and 
the amount of kinetic energy associated with each direction in space are unchanged. 
Obtain the corresponding transformation in phase space and the determinant of this 
transformation. 

8.18. Show that a mole of one perfect gas differs from that of another perfect gas 
only in density. 

8.19. If the energy is only kinetic for a classical collection of systems and homogeneous
in the $p_i$ and there are m pairs $[p_i, q_i]$, then one can extract $\beta$ from the $p_i$ integral and


log F= log V a - ™ log P + C. 


Find S and the p, V relation for an adiabatic change. 

8.20. Define a Hilbert space, $\mathfrak{W}_1$, whose elements are functions defined for $0 \le x \le 2\pi$
and whose inner product is

$$(f, g) = \int_0^{2\pi} (f\bar{g} + f'\bar{g}')\,dx.$$

How is completeness proven?

8.21. Show that the sequence $\{e^{inx}/[2\pi(1 + n^2)]^{1/2}\}$ constitutes an orthonormal set for
$\mathfrak{W}_1$. This set is incomplete, since it is orthogonal to $p(x) = \sinh(x - \pi)/(\sinh 2\pi)^{1/2}$. If
this function is added, show that one has a complete orthonormal set in $\mathfrak{W}_1$.

8.22. If $f(x) \in \mathfrak{W}_1$, show that

$$(f, p) = [f(2\pi) - f(0)] \cosh \pi/[\sinh(2\pi)]^{1/2},$$

and if $f = (f, p)p + u$, then $u(2\pi) = u(0)$. A necessary and sufficient condition for f to be
expressible in $\mathfrak{W}_1$ in terms of the orthogonal set $\{e^{inx}/[2\pi(1 + n^2)]^{1/2}\}$ is that $f(2\pi) = f(0)$.
What does this mean concerning the term-by-term differentiation of Fourier series?

8.23. Show that if $g(t, x)$ is such that for each x, $g_x(t) = g(t, x)$ is in $\mathfrak{W}_1$, and such that
for every $f \in \mathfrak{W}_1$, $(f, g_x)_{\mathfrak{W}} = f(x)$, then

$$g(t, x) = \cosh t \cosh(2\pi - x)/\sinh 2\pi \quad \text{for } t \le x$$

and

$$g(t, x) = \cosh(2\pi - t) \cosh x/\sinh 2\pi \quad \text{for } t > x.$$

Note also that

$$g(t, x) = \sinh(x - \pi)\sinh(t - \pi)/\sinh 2\pi + \sum e^{in(t - x)}/[2\pi(1 + n^2)].$$

The function $g(t, x)$ is analogous to a “reproducing kernel.”

)'g > 8 r< t^ ern * t: ,a! o^ - raiues'^n^^uch^th^TTWO.'lb) Given'- 11 ^ 

then 7* Cd has°a S r UCh ^ If 7 h3S a " adjoinl wilh dense domain "t* 

used to determine N? 86 * Wh0Se ortho 8 onal complement is N. How can this be' 

y a fmictbtnw^'iharT^^^^^hen °* 5,a ' ne ^ ^ ® nd ' n 8 for each value of 

$$(h, u_y) = (Tf_0, u_y) = (f_0, T^*u_y) = (f_0, g(t, y)) = f_0(y).$$

Discuss this formal procedure. 

available! ill's = L'(0 CL 'corra^'T wercdt *P endenl on having T* 

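$$Lf(x) = f(x) = (f, g_x)_{\mathfrak{W}} = \int_0^{2\pi} [f(t)\,g(t, x) + f'(t)\,g_t(t, x)]\,dt.$$

Then

$$(Lf, h)_2 = (f, L^*h)_{\mathfrak{W}} \quad \text{for} \quad L^*h = \int g(t, x)\,h(x)\,dx.$$

We have

$$T = SL, \qquad T^* = L^*S^*,$$

and

$$T^*h = \int_0^{2\pi} g(t, x)\,S^*h(x)\,dx = \int_0^{2\pi} S g(t, x)\,h(x)\,dx,$$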


formally. When are these manipulations justified? 


The Paradox


9.1. Intellectual Ramifications 

Mathematics has always been an essential technical element in civilization,
but it also plays an intellectual role because of the unique character of
its conceptual structure. Geometric and arithmetic ideas framed and supported
Babylonian astrology and as a consequence indicated order and
inevitableness in human affairs. Classical geometry presented the original 
format for a rigorous logical discussion and shaped the whole concept of a 
philosophical point of view based on a specific set of principles from which 
all others are deduced. Problem mathematics is necessarily associated with 
the idea of human affairs based on a mutual understanding arrived at by 
logical means. 

The mathematics of the sixteenth and seventeenth centuries was more 
than a practical conjunction of geometry and problem mathematics. Involved
also were philosophical ideas of infinity and motion. But in addition
there were intellectual imperatives for unity and a need for alternatives to 
philosophical systems that were inadequate for the extraordinary expansion 
of experience that was occurring. The opening up of a world of mathematical 
ideas complemented the telescope, the microscope, and geographical 
exploration. 

The natural philosophy of Newton offered an understanding of the 
solar system on a dynamic rather than purely geometric level and also 
opened up tremendous practical possibilities, for example, the use of machinery.
This certainly altered notions of a homocentric universe. It resulted in a
general conviction that everything we deal with could be explained by proper 
physical laws and mathematics. These physical laws were to be established 
empirically by experience in the form of experiments, and many such laws
were discovered relative to the elastic properties of gases and solids and the 
behavior of electricity. The developments in the eighteenth and nineteenth 
centuries consisted precisely in an expansion of both the known areas that 
were subject to empirical natural laws and the mathematics required to 
provide an effective theory. The basic conviction that there exist completely 
scientific explanations gradually became dominant. 

The mathematical nature of the theory was strongly conducive to efforts
to establish unifying structures. This contributed to the growth and character 
of scientific theory. Mechanics was subjected to increasingly sophisticated 
formulas, general principles involving energy were applied to a wide range of 
phenomena, and chemistry developed both in terms of molecular structure 
and energy. Biology was also transformed from the classification concepts of 
Aristotle to a dynamic form in which the phenomenon of life is seen to be 
subject to an evolutionary principle stated in terms of probability and an 
explicit description of the mechanical and chemical behavior of the material 
associated with the living processes. Thus, conceptually the sciences were 
unified on a deeper theoretical and axiomatic level. 

These developments were completely integral with a tremendous expansion
of mathematics. However, in the closing decades of the nineteenth
century, mathematics entered a new phase. Intuitive elements in the conceptual
structure were replaced by purely “logical” elements of axiomatic
set theory. This new formulation contained isomorphic representations of
the previously available analysis, and this corresponded to a more satisfactory
rigorous development. A wealth of mathematical structures became
available as a basis for scientific theories. Modern mathematics is inventive 
and permits setting up wide-ranging alternatives, and the choice must be 
narrowed by experimentation. This dual role is of course different from the 
situation in “natural philosophy” where geometry and the real numbers as
ratios associated with the Euclidean line were part of a general axiomatic 
formulation involving all scientific magnitudes. 

Much of modern analysis is associated with the quantum mechanics. 
This provided a knowledge of the fine structure of matter, which was a proper 
supplement for the macroscopic description of the behavior of matter, 
electricity, and radiation. This then was the completion of the process of 
developing the scientific understanding of the environment and our usual 
experience. We have then a reasonably complete mathematical scientific 
theory of mechanics, chemistry, electricity, electromagnetism, and radiation. 

Included in this scientific picture was an increasing understanding of the 
chemical processes in living matter. The electron microscope revealed a 
reasonably complete physical structure for the cell. But within this structure, 
the living processes appear to be long and complicated sequences of chemical 
reactions between extremely complex organic molecules. The shape and
chemical constitution of these molecules have been explored by many techniques,
providing information that combined with basic thermodynamics
seems adequate to explain the chemical behavior both in the case of metabolism
and the hereditary processes. It is true that the complexity of living
phenomena prevents a complete detailed description at present, but the area
covered is continuously increasing and the basic principles of “stereochemistry”
appear to be adequate.

Thus, the mathematical description of experience, which began with the 
planetary tables of ancient Babylonia, expanded into a complete formulation 
that included even living phenomena. This was the objective of centuries of 
effort, which finally produced an overall theory that reached back a billion 
years to encompass the origin and evolution of life in a convincing probabilistic
framework. The cosmic horizons widened far beyond the solar
system and not only in space but in time, stretching back to the origin of the 
present universe fifteen billion years ago. Our capabilities limit the range of 
experience to which scientific understanding is applicable. But these capabilities
are rapidly expanding and there is no hint in our present experience of
any boundary that will not be crossed ultimately. 

But this completeness also implies exclusiveness. Since scientific understanding
encompasses all experience, all other understanding must be either
derivative or illusory. Philosophy cannot dig deeper than an understanding 
of all experience. Philosophical “principles” such as the dialectic format of 
“thesis, antithesis, synthesis” must either be shown to be a consequence of 
science, possibly subject to limitations, or admitted to be illusory.

Certainly all the old animistic explanations of nature were swept aside.
The gods who lashed their steeds in the tempests and the angels who carried 
the planets like lamps across the night sky were just figures on a tapestry 
woven in the youth of the race to hide the black abyss of ignorance. In the 
theory of evolution one has a scientific depiction of life from its beginning as 
conglomerations of naturally produced molecules. This understanding has a 
precise mathematical description in terms of chemistry and probability and 
eliminates completely the concept of the intervention by a deity in successive 
steps. 

Indeed, the self-integrity of this understanding precludes the existence 
of God. He could not have created the world, since it has always been. 
Furthermore, any changes are accounted for by the development itself. God 
cannot introduce any change into this mathematically prescribed world 
without destroying it. And if God has no significance for our experience, He
does not exist in any philosophically reasonable sense. 

For further discussion of the material in this section, see Gatlin (1) and
von Neumann (6).


9.2. The Paradox

Thus science has reached its natural pinnacle, displacing philosophy and 
religion. The ultimate basis of government decisions is science or at least a 
claim to be scientific. Any proposed public policy is justified or opposed on 
scientific grounds, and morality is modified, gradually, by what is scientifically
available. Scientific terms and formats are glibly used far in excess of any
appreciation of the disciplined understanding they are supposed to represent. 

But there is one difficulty. Our proof that God does not exist also shows 
that we don’t either. For how can we interfere in the determinate scientific 
world if God can’t? The concept of an experiment in an empirical science 
involves the idea of a directed observation in which the pattern of effect 
is preset by our actions. This is emphasized by the use of instruments. The 
point is that an experiment requires directed observation—an action on our 
part, not a casual observation triggered by circumstances. In many experiments
our interference in the environment extends to isolating some of it
from part of its past history so that a hypothesis can be tested. Thus, we have
contravened the deterministic development in order to increase our understanding.
Indeed, the whole point of understanding is to intervene in the
environment. Nevertheless, empirical sciences produce a mathematically 
determinate picture that includes the behavior of the material in our body. 
The use of probability does not alter this situation or permit animistic intervention,
as is clearly evident in the theory of evolution. Transition probabilities
lead to mathematical predictions with no animistic aspect, just as
Newton’s laws do. Any animistic element is inconsistent with a probability
description, i.e., the latter would be false if such an element is present. (See
Murray (4).)

Thus, we have two alternatives. We can assume that we are capable of 
introducing variations into the environment that are not mathematical 
consequences of the past history. Or we can assume that we are automatons 
following present patterns but subject to the illusion that we can intervene. 
We will call these, respectively, the “action hypothesis” and the “automaton 
hypothesis.” 

The “action hypothesis” is consistent with our usual intuitive interpretation
of our relation with the environment. We consider ourselves to be entities
that are recipients of a continuous stream of impulses from the environment, 
which we integrate in our mind into an awareness of the world. Much of 
the integrand is from the past, i.e., stored, and we believe that we can detach 
this awareness from the present and substitute stored elements at will. Thus, 
awareness can roam in our imagination as we direct. Most of us believe that 
we can use this imagination to make plans, and furthermore, that we can 
direct our bodies to take action based on these plans. 

Our awareness presents us as an entity immersed in the environment.
In this environment we are aware of a natural stream of activity that would 
occur if we did nothing. We expect this activity to be scientifically determinate 
if no human intervention occurs. However, we can and do intervene in this 
natural stream of activity to direct it more favorably for ourselves. The
problem is how these independent actions can occur if our bodies are part of
an essentially determinate environment. Thus, E. Schrödinger in What Is
Life? (5) (p. 85) states,

(i) My body functions as a pure mechanism according to the laws of nature. 

(ii) Yet I know by incontrovertible direct experience, that I am directing its motion. 

There are serious difficulties in making this picture consistent. One basic 
physical principle requires that action always be associated with a configuration
of energy (e.g., a ball rolls downhill or the planets move in the solar
system), and there is the question of how such a configuration can be arbitrarily
introduced, since such an introduction is itself an action. It is an obvious
aspect of controlled action that it corresponds to a cascaded sequence of 
actions, each of which triggers the next, and the mutual relations involved 
here offer no difficulty. But the initiation of the sequence appears to contradict
one or the other of our basic notions.

The action theory is also called, pejoratively, “dualism.” Any acceptance 
of an “animistic explanation” is certainly contrary to the direction of scientific
development over many centuries. No direct positive evidence other
than our subjective impressions is available to support this view. 

The other solution is to assume that our ability to interfere in the world 
is really an illusion. Our organic structure is the result of a long period of 
evolution in which patterns favorable to survival were impressed on organisms
by natural selection. These include patterns of behavior and the capability,
like a computer, to store a summary of past events and to process them.
The computer analogy here is critical, since we know that the computer does 
function as part of the deterministic environment and we can apply the 
evolutionary concept of natural selection both to the “program” of the 
computer and to the apparatus for gathering and storing data. Thus, our 
supposed ability to interfere in the environment is really an illusion. Our 
behavior, which produces this illusion, consists of transforming the data of 
past experience into rules governing our present activity. It is believed that 
procedures corresponding to this description can be set up in a computer, 
i.e., “artificial intelligence,” “adaptive programming.” 

There are technical and practical difficulties associated with the programming
techniques mentioned. An effective automaton must interact
with its environment by receiving impulses from it and reacting. The computer,
which must make the appropriate tie-in between input and output, deals
only with the symbols associated with information and action. The actual
effective information for which a symbol is to be entered may correspond to a 
considerable abstraction from the received set of impulses. One hopes the 
desired symbol is a function of the set of impulses received, but determining 
this function, the problem of “pattern recognition,” is frequently difficult.
Similarly, the action required may have a simple symbolic representation 
that is well understood, but the action itself may have to be resolved into a 
complex combination of specific motions that must be individually realized. 

The proponents of this theory write as if these difficulties are the only 
issues involved and use an anthropomorphic language, using terms such as 
“sensing” and “decision.” For example, the computer process of computing a 
function with discrete values is a “decision.” This automaton model is 
fundamentally inadequate, and there are obvious difficulties with this 
approach, which has received widespread and complacent acceptance. The 
automaton is scientifically determinate, yet the scientific theories on which it 
is based involve experimentation containing directed and controlled actions 
for the purpose of observation. If this control and direction is an illusion, then 
the scientific theories lack an experimental basis. 

Obviously, in an automaton there is no need for awareness. A computer 
does not have awareness. Since the actions of the automaton are predetermined,
our awareness is a completely ineffective adjunct. Our decisions are
determined for us by the overall development of events. Thus, awareness 
represents a duality in which it has no meaning, since awareness can be 
effective only by controlling action. This dualism is even more objectionable 
than the preceding. 


9.3. Final Comment 

Trapped by political considerations into committing what he knew to be 
a grave injustice, Pontius Pilate asked, “What is truth?” This reflected a
certain familiarity with philosophy as it was taught in the ancient universities.
The academic world no longer explicitly asks, “What is truth?” or “What is
God?” or “What is mathematics?” The professional developers of knowledge
deal with more sophisticated concerns whose significance in each case is 
apparent only to a small circle of cognoscenti. 

The years have witnessed the growth of this vast coral reef of knowledge, 
and professional advancement in the universities has become associated 
with this growth. But most students in the university consider the objective 
of their education as their personal development, and there are incompatible 
elements in these objectives. The student with facility in certain intellectual 
exercises is frequently enticed into a complete acceptance of the supremacy of 
knowledge. The academic profession recruits itself from these. This produces
a subculture with extremely separatist tendencies. The reaction against this
is perhaps even more unfortunate: the belief that action is incompatible with
intellectual development. 

This may be the most critical aspect of our culture. Underneath many 
of the tendencies of our times are the moving forces of this split. Men of 
power, desirous of action, have developed an impatience with intellectual 
understanding, and movements that appeal to the academically oriented, 
such as “saving the environment,” easily assume the objective of stopping all
action. 

The student may find his opportunities rather unpleasantly restricted 
by this schism. Nevertheless there is available to him a great hoard of 
intellectual treasures, and his immediate concern should be not with such 
questions as “What is truth?” or “What is mathematics?” or even “What can
I do?”, but with “What can I do with my mind?” 


Exercises 

9.1. Consider the relationship of the information concept considered in the Gatlin (1)
book and the “information” in a blueprint. The latter should yield a structure when
properly interpreted. How is the “information” of the living system to be decoded and
how does this relate the individual to his environment? Does the environment decode the
message? How is this discussion related to the subject of the von Neumann (6) book?
How is a living organism a “history of two billion years”?

9.2. Discuss the responses of Heisenberg (2), Schrödinger (5), and Whitehead
(Joad (3), Chapter XX, introduces the ideas of Whitehead and gives further references)
to the paradox of Section 9.2.